Analysis functions
Calorine provides convenience functions for calculating averages and errors over a time series of data (analyze_data() and others).
Furthermore, several functions are available for plotting results (see, e.g., calorine.analysis.plot_kappas_with_average()).
These functions are primarily intended for quick visualizations, e.g., in a Jupyter notebook:
plot_kappas_with_average(kappas)
There is also a function to obtain averages over many runs (calorine.analysis.get_run_average()), which takes care of computing correlation lengths and producing error estimates:
get_run_average(kappas, nequil=300)
Module
- calorine.data_analysis.analyze_data(data, max_lag=None)
Carries out an extensive analysis of the data series.
- Parameters
data (ndarray) – data series to compute autocorrelation function for
max_lag (Optional[int]) – maximum lag between two data points, used for computing autocorrelation
- Returns
calculated properties of the data, including mean, standard deviation, correlation length, and a 95% error estimate
- Return type
dict
- calorine.data_analysis.get_autocorrelation_function(data, max_lag=None)
Returns autocorrelation function.
The autocorrelation function is computed using pandas.Series.autocorr.
- Parameters
data (ndarray) – data series to compute autocorrelation function for
max_lag (Optional[int]) – maximum lag between two data points
- Return type
calculated autocorrelation function
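To illustrate what a lag-based autocorrelation function looks like, here is a minimal standard-library sketch. It mirrors the idea behind pandas.Series.autocorr, which computes the Pearson correlation between a series and a shifted copy of itself. The helper names pearson and autocorrelation_sketch, as well as the default for max_lag, are hypothetical and not part of calorine.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def autocorrelation_sketch(data, max_lag=None):
    """Lagged Pearson correlation for lags 0..max_lag (illustrative only)."""
    if max_lag is None:
        # hypothetical default: keep at least two data pairs at the largest lag
        max_lag = len(data) - 2
    acf = []
    for lag in range(max_lag + 1):
        # correlate the series with a copy of itself shifted by `lag` points
        acf.append(pearson(data[: len(data) - lag], data[lag:]))
    return acf


# a slowly varying series is strongly correlated at short lags
series = [math.sin(0.1 * i) for i in range(200)]
acf = autocorrelation_sketch(series, max_lag=20)
```

The value at lag 0 is 1 by construction, and the curve decays as the lag grows. The sketch does not guard against constant data, for which the variance is zero.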
- calorine.data_analysis.get_correlation_length(data)
Returns estimate of the correlation length of data.
The correlation length is taken as the first point where the autocorrelation function is less than \(\exp(-2)\). If the autocorrelation function never drops below \(\exp(-2)\), np.nan is returned.
If the correlation length cannot be computed since the ACF is unconverged, the function returns None.
- Parameters
data (ndarray) – data series for which to compute the autocorrelation function
- Return type
correlation length
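The threshold rule described for get_correlation_length can be sketched as follows. This is an illustration under stated assumptions, not calorine's implementation: the function name correlation_length_sketch is hypothetical, the input is a precomputed autocorrelation function rather than the raw data, and the unconverged case that yields None is not modeled.

```python
import math


def correlation_length_sketch(acf):
    """Return the first lag at which acf drops below exp(-2), else nan."""
    threshold = math.exp(-2)
    for lag, value in enumerate(acf):
        if value < threshold:
            return lag
    return math.nan  # the ACF never dropped below the threshold


# idealized, exponentially decaying ACF with decay constant 5 (made-up data)
acf = [math.exp(-lag / 5) for lag in range(50)]
length = correlation_length_sketch(acf)
```

For this idealized input the ACF equals the threshold at lag 10 and first falls below it at lag 11, so the sketch returns 11.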
- calorine.data_analysis.get_error_estimate(data, confidence=0.95)
Returns estimate of standard error \(\mathrm{error}\) with confidence interval.
\[\mathrm{error} = t_\mathrm{factor} \cdot \mathrm{std}(\mathrm{data}) / \sqrt{N_s}\]
where \(t_\mathrm{factor}\) is the factor corresponding to the confidence interval and \(N_s\) is the number of independent measurements (with correlation taken into account).
If the correlation length cannot be computed since the ACF is unconverged, the function returns None.
- Parameters
data (ndarray) – data series for which to estimate the error
- Return type
error estimate
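The error formula can be made concrete with a short standard-library sketch. Two assumptions are made here that the text does not spell out: \(N_s\) is taken as the series length divided by the correlation length, and the factor for the confidence interval is approximated by a normal quantile (a large-sample stand-in for a Student t factor). The function name error_estimate_sketch and the example data are hypothetical.

```python
import math
from statistics import NormalDist, pstdev


def error_estimate_sketch(data, correlation_length, confidence=0.95):
    """Illustrative error estimate: t_factor * std(data) / sqrt(N_s)."""
    # Assumption: number of independent samples = length / correlation length
    n_independent = len(data) / correlation_length
    # Assumption: normal quantile used in place of a Student t factor
    t_factor = NormalDist().inv_cdf((1 + confidence) / 2)
    # pstdev is the population standard deviation (no Bessel correction)
    return t_factor * pstdev(data) / math.sqrt(n_independent)


# made-up series of 100 points with an assumed correlation length of 4
data = [1.0, 2.0, 3.0, 4.0] * 25
error = error_estimate_sketch(data, correlation_length=4)
```

A longer correlation length means fewer independent samples and therefore a larger error estimate for the same data.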
- calorine.analysis.get_run_average(kappas, nequil)
Computes averages over several simulations and returns mean, error estimate and correlation length in the form of a dictionary.
- Parameters
kappas (List[DataFrame]) – list of dataframes with data read from kappa.out files
nequil (int) – number of data points to drop in the beginning to account for equilibration
- Return type
dict
- calorine.analysis.plot_kappas_distribution(kappas, nequil=0, title='')
Generates an overview figure of the thermal conductivity data from a series of runs.
- Parameters
kappas (Dict[str, DataFrame]) – dictionary with each entry containing a dataframe read from a kappa.out file
nequil (int) – number of steps to drop in the beginning of each trajectory to account for equilibration
title (str) – title string (optional)
- Return type
None
- calorine.analysis.plot_kappas_split(kappas, nsim_max=10, title='')
Generates an overview figure of the thermal conductivity data from a series of runs.
- Parameters
kappas (Dict[str, DataFrame]) – dictionary with each entry containing a dataframe read from a kappa.out or hac.out file
nsim_max (int) – maximum number of columns (=runs) to show
title (str) – title string (optional)
- Return type
None
- calorine.analysis.plot_kappas_with_average(kappas, title='')
Generates an overview figure of the thermal conductivity data from a series of runs.
- Parameters
kappas (Dict[str, DataFrame]) – dictionary with each entry containing a dataframe read from a kappa.out or hac.out file
title (str) – title string (optional)
- Return type
None
- calorine.analysis.plot_thermos_split(thermos, nsim_max=10, title='')
Generates an overview figure of the thermodynamic data from a series of runs.
- Parameters
thermos (Dict[str, DataFrame]) – dictionary with each entry containing a dataframe read from a thermo.out file
nsim_max (int) – maximum number of columns (=runs) to show
title (str) – title string (optional)
- Return type
None
- calorine.analysis.plot_thermos_with_average(thermos, title='')
Generates an overview figure of the thermodynamic data from a series of runs.
- Parameters
thermos – dictionary with each entry containing a dataframe read from a thermo.out file
title (str) – title string (optional)
- Return type
None