
Theory

Before examining the effect of statistical errors in histogram reweighting, it is instructive to review our understanding of errors in MC simulations. Because of the finite number of measurements, any quantity measured in a simulation will suffer from statistical and systematic errors. [4] This is further complicated by the fact that the measurements are not, in general, independent. The first careful study of statistical errors in MC simulations was performed by Müller-Krumbhaar and Binder [2] more than 20 years ago. They considered the statistical error in the average value of some quantity $f$ measured in a simulation. If $f_i$ is the value of $f$ at the $i$th step of the simulation, the average value of $f$, $\langle f \rangle$, computed from a simulation consisting of $N$ measurements (after discarding a sufficient number of measurements for equilibration), is

\[ \langle f \rangle = \frac{1}{N} \sum_{i=1}^{N} f_i . \tag{1} \]

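To calculate the statistical error in $\langle f \rangle$, Müller-Krumbhaar and Binder started with the expression for the variance of a sum of $N$ correlated random variables[8]

\[ \sigma^2\Big(\sum_{i=1}^{N} f_i\Big) = \sum_{i=1}^{N} \sigma^2(f_i) + 2 \sum_{i=1}^{N} \sum_{j>i} \mathrm{cov}(f_i, f_j) \]

and related the covariance term to a sum of time-displaced averages. Their final expression is traditionally written as

\[ \delta^2 \langle f \rangle = \frac{\langle f^2 \rangle - \langle f \rangle^2}{N} \, (1 + 2\tau_f) \tag{2} \]

where $\tau_f$ is the integrated correlation time[9] for the quantity $f$

\[ \tau_f = \sum_{t=1}^{N-1} \Big(1 - \frac{t}{N}\Big) \, \phi_f(t) \]

and $\phi_f(t)$ is the time-displaced autocorrelation function

\[ \phi_f(t) = \frac{\langle f(0) f(t) \rangle - \langle f \rangle^2}{\langle f^2 \rangle - \langle f \rangle^2} . \]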
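
The error estimate built from the integrated correlation time can be sketched numerically as follows. This is a minimal illustration, not the authors' code: the function names and the truncation parameter `t_max` are assumptions introduced here, since in practice the sum over $t$ has to be cut off before the noisy tail of the autocorrelation function dominates.

```python
import numpy as np

def autocorrelation(f, t):
    """Normalized time-displaced autocorrelation phi_f(t) of the series f."""
    f = np.asarray(f, dtype=float)
    if t == 0:
        return 1.0
    mean = f.mean()
    var = f.var()                          # <f^2> - <f>^2
    # <f(0) f(t)> averaged over the N - t available pairs
    displaced = np.mean(f[:-t] * f[t:])
    return (displaced - mean**2) / var

def integrated_correlation_time(f, t_max):
    """tau_f = sum_{t=1}^{t_max} (1 - t/N) phi_f(t); t_max is a user-chosen cutoff."""
    n = len(f)
    return sum((1.0 - t / n) * autocorrelation(f, t) for t in range(1, t_max + 1))

def error_of_mean(f, t_max):
    """Statistical error of <f>: sqrt(var(f) / N * (1 + 2 tau_f))."""
    f = np.asarray(f, dtype=float)
    tau = integrated_correlation_time(f, t_max)
    return np.sqrt(f.var() / len(f) * (1.0 + 2.0 * tau))
```

For uncorrelated measurements the estimated $\tau_f$ is close to zero and the result reduces to the familiar $\sigma/\sqrt{N}$; for correlated data the error grows by the factor $\sqrt{1 + 2\tau_f}$.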

Our description of statistical errors in histogram reweighting will follow the Müller-Krumbhaar--Binder formalism rather closely. To see how this is possible, we first point out that MC data can be reweighted without using histograms. Consider an MC simulation performed at $T = T_0$. The average value of some quantity $f(E)$ at temperature $T$, calculated using the single histogram method, is

\[ \langle f \rangle_\beta = \frac{\sum_E f(E) \, H(E) \, e^{-\Delta\beta E}}{\sum_E H(E) \, e^{-\Delta\beta E}} \tag{4} \]

where $\Delta\beta = 1/k_B T - 1/k_B T_0$ and $E$ is the total energy of a configuration. The histogram $H(E)$ is constructed from the time sequence of energies $E_i$ generated during the simulation

\[ H(E) = \sum_{i=1}^{N} \delta_{E, E_i} \tag{5} \]

where $\delta_{E, E_i}$ is the Kronecker delta function and the sum runs over the $N$ measurements made during the simulation. By inserting the definition of $H(E)$, (5), into (4), and performing the sum over $E$ first, we get the equation for ``reweighting on the fly'', or reweighting without histograms:

\[ \langle f \rangle_\beta = \frac{\sum_{i=1}^{N} f_i \, e^{-\Delta\beta E_i}}{\sum_{i=1}^{N} e^{-\Delta\beta E_i}} . \tag{6} \]

When $\Delta\beta = 0$, this reduces to the standard expression for the average of a quantity (1). The ``reweighting on the fly'' approach is useful for analyzing data requiring a multi-dimensional histogram, or for continuous systems to avoid the need to bin the data.
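
Reweighting on the fly amounts to a weighted average over the stored time series, as the short sketch below illustrates. The function name `reweight` and the argument `delta_beta` (the shift in inverse temperature) are assumptions made for this example; the subtraction of the largest exponent is a standard numerical safeguard that cancels in the ratio and is not part of the formalism itself.

```python
import numpy as np

def reweight(f, energy, delta_beta):
    """Reweighted average of f from paired measurements (f_i, E_i).

    Computes sum_i f_i exp(-delta_beta * E_i) / sum_i exp(-delta_beta * E_i).
    """
    f = np.asarray(f, dtype=float)
    energy = np.asarray(energy, dtype=float)
    log_w = -delta_beta * energy
    # Shift the exponents so the largest weight is 1; the shift cancels in the ratio.
    w = np.exp(log_w - log_w.max())
    return np.sum(f * w) / np.sum(w)
```

With `delta_beta = 0` every weight equals one and the function returns the plain arithmetic mean of the measurements.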

To simplify the formalism, we define a ``curly-bracket'' notation for averages that include the reweighting factor $e^{-\Delta\beta E}$. Each term inside the $\{\;\}$ carries along a reweighting factor. Examples of this notation are:

\[ \{f\} = \frac{1}{N} \sum_{i=1}^{N} f_i \, e^{-\Delta\beta E_i}, \qquad \{f^2\} = \frac{1}{N} \sum_{i=1}^{N} f_i^2 \, e^{-2\Delta\beta E_i}, \]
\[ \{f(0) f(t)\} = \frac{1}{N-t} \sum_{i=1}^{N-t} f_i \, e^{-\Delta\beta E_i} \, f_{i+t} \, e^{-\Delta\beta E_{i+t}} . \]

Note that these averages can also be calculated using the corresponding histograms. The single-histogram equation itself (4) expressed in this notation is

\[ \langle f \rangle_\beta = \frac{\{f\}}{\{1\}} . \]

The analysis of errors is complicated by the fact that once reweighting has been performed, both the numerator and denominator in (6) will suffer from statistical error (in (1), the denominator is simply the number of measurements, $N$, which has no error). We represent the square of these errors by $\delta^2\{f\}$ and $\delta^2\{1\}$ respectively. In addition, we expect that the error in the numerator is correlated with that of the denominator because both are calculated from the same set of measurements. It is important to note that this correlation is present even if there is no correlation between measurements during the simulation.

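If $\{f\}$ and $\{1\}$ were independent, the square of the statistical error in $\langle f \rangle_\beta$ would be given by

\[ \delta^2 \langle f \rangle_\beta = \Big(\frac{\partial \langle f \rangle_\beta}{\partial \{f\}}\Big)^2 \delta^2\{f\} + \Big(\frac{\partial \langle f \rangle_\beta}{\partial \{1\}}\Big)^2 \delta^2\{1\} \]

or

\[ \delta^2 \langle f \rangle_\beta = \frac{\delta^2\{f\}}{\{1\}^2} + \frac{\{f\}^2 \, \delta^2\{1\}}{\{1\}^4} \tag{7} \]

which is the standard expression for the propagation of error in a function of two independent variables [10]. However, because $\{f\}$ and $\{1\}$ are not independent, (7) is not correct, and in fact overestimates the true error. To properly take this correlation into account, we must include the covariance $\mathrm{cov}(\{f\},\{1\})$. The correct expression for the square of the statistical error in $\langle f \rangle_\beta$ is then given by

\[ \delta^2 \langle f \rangle_\beta = \Big(\frac{\partial \langle f \rangle_\beta}{\partial \{f\}}\Big)^2 \delta^2\{f\} + \Big(\frac{\partial \langle f \rangle_\beta}{\partial \{1\}}\Big)^2 \delta^2\{1\} + 2 \, \frac{\partial \langle f \rangle_\beta}{\partial \{f\}} \, \frac{\partial \langle f \rangle_\beta}{\partial \{1\}} \, \mathrm{cov}(\{f\},\{1\}) \]

or

\[ \delta^2 \langle f \rangle_\beta = \frac{\delta^2\{f\}}{\{1\}^2} + \frac{\{f\}^2 \, \delta^2\{1\}}{\{1\}^4} - \frac{2 \, \{f\} \, \mathrm{cov}(\{f\},\{1\})}{\{1\}^3} . \]

The square of the relative error in $\langle f \rangle_\beta$ takes on a particularly simple form:

\[ \frac{\delta^2 \langle f \rangle_\beta}{\langle f \rangle_\beta^2} = \frac{\delta^2\{f\}}{\{f\}^2} + \frac{\delta^2\{1\}}{\{1\}^2} - \frac{2 \, \mathrm{cov}(\{f\},\{1\})}{\{f\}\{1\}} . \]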

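To facilitate the derivation of these terms, let us consider the covariance of two arbitrary functions $R$ and $Q$ which have the same form as $\{f\}$ and $\{1\}$:

\[ R = \{r\} = \frac{1}{N} \sum_{i=1}^{N} r_i \, e^{-\Delta\beta E_i}, \qquad Q = \{q\} = \frac{1}{N} \sum_{i=1}^{N} q_i \, e^{-\Delta\beta E_i} . \]

From this, we can then easily calculate $\delta^2\{f\}$, $\delta^2\{1\}$ and $\mathrm{cov}(\{f\},\{1\})$ by replacing the functions $r$ and $q$ with $f$ and 1 appropriately. By generalizing the analysis of Müller-Krumbhaar and Binder, we can define the covariance as

\[ \mathrm{cov}(R, Q) = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \big[ \{r_i q_j\} - \{r\}\{q\} \big] \tag{9} \]

where $\{r_i q_j\}$ is the average of the product $r_i \, e^{-\Delta\beta E_i} \, q_j \, e^{-\Delta\beta E_j}$. The double sum over $i$ and $j$ can be replaced by a single sum of time-displaced averages

\[ \sum_{i=1}^{N} \sum_{j=1}^{N} \{r_i q_j\} = N \{rq\} + \sum_{t=1}^{N-1} (N - t) \big[ \{r(0) q(t)\} + \{q(0) r(t)\} \big] \tag{10} \]

where $t$ is the time displacement index. The covariance (9) is thus expressed as

\[ \mathrm{cov}(R, Q) = \frac{1}{N} \big[ \{rq\} - \{r\}\{q\} \big] + \frac{1}{N} \sum_{t=1}^{N-1} \Big(1 - \frac{t}{N}\Big) \big[ \{r(0) q(t)\} + \{q(0) r(t)\} - 2 \{r\}\{q\} \big] . \tag{11} \]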

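To complete the generalization of the Müller-Krumbhaar--Binder formalism, we define a reweighted time-displaced cross-correlation function

\[ \phi_{rq}(t) = \frac{\{r(0) q(t)\} + \{q(0) r(t)\} - 2 \{r\}\{q\}}{2 \big[ \{rq\} - \{r\}\{q\} \big]} \]

and a reweighted correlation time

\[ \tau_{rq} = \sum_{t=1}^{N-1} \Big(1 - \frac{t}{N}\Big) \, \phi_{rq}(t) \tag{12} \]

to finally obtain

\[ \mathrm{cov}(R, Q) = \frac{\{rq\} - \{r\}\{q\}}{N} \, (1 + 2\tau_{rq}) . \tag{13} \]

With the proper substitutions for $r$ and $q$ in (13) we can now evaluate $\delta^2\{f\}$, $\delta^2\{1\}$ and $\mathrm{cov}(\{f\},\{1\})$:

\[ \delta^2\{f\} = \frac{\{f^2\} - \{f\}^2}{N} \, (1 + 2\tau_{ff}), \qquad \delta^2\{1\} = \frac{\{1^2\} - \{1\}^2}{N} \, (1 + 2\tau_{11}), \]
\[ \mathrm{cov}(\{f\},\{1\}) = \frac{\{f \cdot 1\} - \{f\}\{1\}}{N} \, (1 + 2\tau_{f1}) \]

and the square of the relative error in $\langle f \rangle_\beta$

\[ \frac{\delta^2 \langle f \rangle_\beta}{\langle f \rangle_\beta^2} = \frac{1}{N} \Big[ \frac{\{f^2\} - \{f\}^2}{\{f\}^2} (1 + 2\tau_{ff}) + \frac{\{1^2\} - \{1\}^2}{\{1\}^2} (1 + 2\tau_{11}) - 2 \, \frac{\{f \cdot 1\} - \{f\}\{1\}}{\{f\}\{1\}} (1 + 2\tau_{f1}) \Big] . \tag{14} \]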

The error in a thermodynamic quantity like the energy can be obtained simply by replacing f with E in (14). To compute the error in a response function, for example the specific heat, we need to find the appropriate function whose average value gives us the desired quantity. For the specific heat, the function is

so that the specific heat C is given by

The quantities of interest are then:

The time-displaced correlation functions are quite complex. For example, to calculate

for the specific heat, we need which is given by

and which is given by

where

A different approach to calculating the error in C is to make two independent passes through the data, calculating the reweighted average energy in the first pass, then directly evaluating the specific-heat function for each measurement in the second. We found that this second approach is easier to implement, and is more stable numerically.
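
The two-pass procedure can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the function name, the arguments `beta0`, `beta` and `n_sites`, and in particular the prefactor $\beta^2 / n_{\mathrm{sites}}$ (with $k_B = 1$) are choices made for this example, since the normalization used in the text is not reproduced here.

```python
import numpy as np

def two_pass_specific_heat(energy, beta0, beta, n_sites):
    """Reweighted specific heat from a time series of total energies.

    Assumes C = beta^2 <(E - <E>)^2> / n_sites with k_B = 1 (a common
    convention, taken here as an assumption).
    """
    energy = np.asarray(energy, dtype=float)
    log_w = -(beta - beta0) * energy
    w = np.exp(log_w - log_w.max())       # shift cancels after normalization
    w /= w.sum()

    # Pass 1: reweighted average energy.
    e_mean = np.sum(w * energy)

    # Pass 2: evaluate the fluctuation function for every measurement,
    # then average it with the same weights.
    c = beta**2 * (energy - e_mean) ** 2 / n_sites
    return np.sum(w * c)
```

Because the per-measurement values `c` are formed from differences about the mean, this route avoids subtracting two large, nearly equal moments, which is one plausible reason for the better numerical stability noted above.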

The expressions for the error have two different kinds of terms: some that depend on the simulation algorithm used, containing the reweighted correlation times, and others that represent equilibrium averages and are therefore independent of the simulation algorithm. However, unlike the non-reweighted case, we cannot simply factor out the correlation time dependence; this will lead to non-trivial differences in how the error increases with $\Delta\beta$ when we change from one simulation algorithm to another! Examples making use of the formalism developed here are given in the next two sections.
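
As a concrete illustration of how the variance, covariance and correlation-time terms combine, the sketch below estimates the relative error of a reweighted average directly from the time series. The function names, the cutoff `t_max` and the use of sample means in place of exact averages are assumptions of this example. The covariance is accumulated as $c(0) + 2\sum_t (1 - t/N)\,c(t)$, which is algebraically the same as $c(0)(1 + 2\tau)$ but avoids dividing by $c(0)$ when it vanishes (for example, for the denominator at zero temperature shift).

```python
import numpy as np

def displaced_cov(a, b, t_max):
    """Covariance of the means of two correlated, already-weighted series.

    Returns (1/N) [c(0) + 2 sum_{t=1}^{t_max} (1 - t/N) c(t)], where c(t)
    is the symmetrized time-displaced covariance of a and b.
    """
    n = len(a)
    ma, mb = a.mean(), b.mean()
    total = np.mean(a * b) - ma * mb
    for t in range(1, t_max + 1):
        c_t = 0.5 * (np.mean(a[:-t] * b[t:]) + np.mean(b[:-t] * a[t:])) - ma * mb
        total += 2.0 * (1.0 - t / n) * c_t
    return total / n

def reweighted_relative_error(f, energy, delta_beta, t_max):
    """Relative statistical error of the reweighted average of f.

    Assumes the reweighted average of f is non-zero.
    """
    f = np.asarray(f, dtype=float)
    energy = np.asarray(energy, dtype=float)
    log_w = -delta_beta * energy
    w = np.exp(log_w - log_w.max())       # overall scale cancels in the result
    num_series, den_series = f * w, w     # terms of the numerator and denominator
    num, den = num_series.mean(), den_series.mean()

    var_num = displaced_cov(num_series, num_series, t_max)
    var_den = displaced_cov(den_series, den_series, t_max)
    cov_nd = displaced_cov(num_series, den_series, t_max)

    rel2 = var_num / num**2 + var_den / den**2 - 2.0 * cov_nd / (num * den)
    return np.sqrt(max(rel2, 0.0))        # guard against tiny negative round-off
```

At `delta_beta = 0` the denominator and cross terms vanish and the result is the ordinary relative error of the mean; away from it, all three terms contribute, each with its own correlation-time dependence.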






root
Thu Jun 22 14:26:19 EDT 1995