# Analysis functions

Calorine provides convenience functions for calculating averages and error estimates over a time series of data (analyze_data() and others).

Furthermore, several functions are available for plotting results (see, e.g., calorine.analysis.plot_kappas_with_average()). These functions are primarily intended for quick visualizations, e.g., in a Jupyter notebook:

plot_kappas_with_average(kappas)


There is also a function to obtain averages over many runs (calorine.analysis.get_run_average()), which takes care of computing correlation lengths and producing error estimates:

get_run_average(kappas, nequil=300)


## Module

calorine.data_analysis.analyze_data(data, max_lag=None)

Carries out an extensive analysis of the data series.

Parameters
• data (ndarray) – data series to compute autocorrelation function for

• max_lag (Optional[int]) – maximum lag between two data points, used for computing autocorrelation

Returns

calculated properties of the data, including mean, standard deviation, correlation length, and a 95% error estimate

Return type

dict

calorine.data_analysis.get_autocorrelation_function(data, max_lag=None)

Returns autocorrelation function.

The autocorrelation function is computed using pandas.Series.autocorr.

Parameters
• data (ndarray) – data series to compute autocorrelation function for

• max_lag (Optional[int]) – maximum lag between two data points

Returns

calculated autocorrelation function
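
As an illustration of what such an autocorrelation function contains, the following is a minimal sketch and not the calorine implementation. pandas.Series.autocorr computes the Pearson correlation between a series and a copy of itself shifted by a given lag; the sketch reproduces that quantity with NumPy for a range of lags. The function name `autocorrelation_function` and the default chosen for `max_lag` are assumptions made for this example.

```python
import numpy as np


def autocorrelation_function(data, max_lag=None):
    """Pearson autocorrelation of `data` for lags 0..max_lag (illustrative only)."""
    data = np.asarray(data, dtype=float)
    if max_lag is None:
        # Assumption: use every lag that still leaves two overlapping points.
        max_lag = len(data) - 2
    acf = [1.0]  # a series is perfectly correlated with itself at lag 0
    for lag in range(1, max_lag + 1):
        # Correlate the series with a copy of itself shifted by `lag` points.
        acf.append(np.corrcoef(data[:-lag], data[lag:])[0, 1])
    return np.array(acf)


# An alternating series is anti-correlated at odd lags and correlated at even lags.
series = np.array([1.0, -1.0] * 50)
acf = autocorrelation_function(series, max_lag=4)
```

For noisy simulation data the values typically decay from 1 toward 0 with increasing lag, which is the behavior the correlation length estimate relies on.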

calorine.data_analysis.get_correlation_length(data)

Returns estimate of the correlation length of data.

The correlation length is taken as the first point where the autocorrelation function is less than $$\exp(-2)$$. If the correlation function never drops below $$\exp(-2)$$, np.nan is returned.

If the correlation length cannot be computed since the ACF is unconverged the function returns None.

Parameters

data (ndarray) – data series for which to compute the autocorrelation function

Returns

correlation length
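
The threshold criterion described for this function can be sketched as follows. This is a hedged example that operates on an already computed autocorrelation function rather than on raw data. The helper name `correlation_length` is hypothetical, and returning None when the threshold is never crossed is an assumption of this sketch (the description above mentions both np.nan and None).

```python
import numpy as np


def correlation_length(acf, threshold=np.exp(-2)):
    """Index of the first ACF value below `threshold`, or None if there is none."""
    for lag, value in enumerate(acf):
        if value < threshold:
            return lag
    # Assumption: signal an unconverged ACF with None.
    return None


# Synthetic, exponentially decaying ACF exp(-lag / tau) with tau = 5.2.
lags = np.arange(50)
acf = np.exp(-lags / 5.2)
length = correlation_length(acf)

# A constant ACF never crosses the threshold.
unconverged = correlation_length(np.ones(50))
```

For the exponential decay above, the threshold is crossed at the first lag larger than 2 * tau, so the estimate is roughly twice the decay time of the autocorrelation function.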

calorine.data_analysis.get_error_estimate(data, confidence=0.95)

Returns estimate of standard error $$\mathrm{error}$$ with confidence interval.

$\mathrm{error} = t_\mathrm{factor} * \mathrm{std}(\mathrm{data}) / \sqrt{N_s}$

where $$t_\mathrm{factor}$$ is the factor corresponding to the confidence interval and $$N_s$$ is the number of independent measurements (with correlation taken into account).

If the correlation length cannot be computed since the ACF is unconverged the function returns None.

Parameters
• data (ndarray) – data series for which to estimate the error

• confidence (float) – confidence level of the error estimate

Returns

error estimate
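
To make the formula above concrete, here is a minimal sketch under stated assumptions; it is not the calorine implementation. It assumes that the number of independent measurements is the number of data points divided by the correlation length, and it takes the factor from the standard normal distribution via the Python standard library instead of a Student t distribution. The function name `error_estimate` and its `correlation_length` argument are hypothetical.

```python
from statistics import NormalDist

import numpy as np


def error_estimate(data, correlation_length, confidence=0.95):
    """Error of the mean for correlated data (illustrative only)."""
    data = np.asarray(data, dtype=float)
    # Assumption: N_s = number of points / correlation length.
    n_independent = len(data) / correlation_length
    # Assumption: two-sided factor from the normal distribution,
    # which approximates the Student t factor for large N_s.
    t_factor = NormalDist().inv_cdf(0.5 + confidence / 2)
    return t_factor * np.std(data) / np.sqrt(n_independent)


data = np.array([1.0, 2.0, 3.0, 4.0, 5.0] * 20)  # 100 points, std = sqrt(2)
error = error_estimate(data, correlation_length=4)
error_longer = error_estimate(data, correlation_length=8)
```

Doubling the correlation length halves the number of independent measurements and therefore increases the error estimate by a factor of $$\sqrt{2}$$.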

calorine.analysis.get_run_average(kappas, nequil)

Computes averages over several simulations and returns mean, error estimate and correlation length in the form of a dictionary.

Parameters
• kappas (List[DataFrame]) – list of dataframes with data read from kappa.out files

• nequil (int) – number of data points to drop in the beginning to account for equilibration

Return type

dict

calorine.analysis.plot_kappas_distribution(kappas, nequil=0, title='')

Generates an overview figure of the thermal conductivity data from a series of runs.

Parameters
• kappas (Dict[str, DataFrame]) – dictionary with each entry containing a dataframe read from a kappa.out file

• nequil (int) – number of steps to drop in the beginning of each trajectory to account for equilibration

• title (str) – title string (optional)

Return type

None

calorine.analysis.plot_kappas_split(kappas, nsim_max=10, title='')

Generates an overview figure of the thermal conductivity data from a series of runs.

Parameters
• kappas (Dict[str, DataFrame]) – dictionary with each entry containing a dataframe read from a kappa.out or hac.out file

• nsim_max (int) – maximum number of columns (= runs) to show

• title (str) – title string (optional)

Return type

None

calorine.analysis.plot_kappas_with_average(kappas, title='')

Generates an overview figure of the thermal conductivity data from a series of runs.

Parameters
• kappas (Dict[str, DataFrame]) – dictionary with each entry containing a dataframe read from a kappa.out or hac.out file

• title (str) – title string (optional)

Return type

None

calorine.analysis.plot_thermos_split(thermos, nsim_max=10, title='')

Generates an overview figure of the thermodynamic data from a series of runs.

Parameters
• thermos (Dict[str, DataFrame]) – dictionary with each entry containing a dataframe read from a thermo.out file

• nsim_max (int) – maximum number of columns (= runs) to show

• title (str) – title string (optional)

Return type

None

calorine.analysis.plot_thermos_with_average(thermos, title='')

Generates an overview figure of the thermodynamic data from a series of runs.

Parameters
• thermos (Dict[str, DataFrame]) – dictionary with each entry containing a dataframe read from a thermo.out file

• title (str) – title string (optional)

Return type

None