NEP interface#

Training IO#

calorine provides a number of functions for preparing input files for training NEP models, most notably setup_training. There are also several functions for analyzing the training process, such as read_loss, read_structures, and get_parity_data.

calorine.nep.get_parity_data(structures, property, selection=None, flatten=True)[source]#

Returns the predicted and target energies, forces, virials or stresses from a list of structures in a format suitable for generating parity plots.

The structures should have been read using read_structures, such that the info object is populated with keys of the form <property>_<type> where <property> is, e.g., energy or force and <type> is one of predicted or target.

The resulting parity data is returned as a DataFrame holding the predicted and target values.

Parameters:

structures (list[Atoms]) – List of structures as read with read_structures.
property (str) – One of energy, force, virial, stress, bec, dipole, or polarizability.
selection (list[str]) – A list containing which components to return, and/or the norm. Possible values are x, y, z, xx, yy, zz, yz, xz, xy, norm, pressure.
flatten (bool) – if True return flattened lists; this is useful for collapsing the components of forces or virials into a single list

Return type:

DataFrame
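
The following is a minimal sketch of how the parity data might be condensed into a single error metric. It assumes that the returned data frame has columns named predicted and target, which is not stated above and should be checked against the actual output. The helper names rmse and force_rmse are hypothetical.

```python
import math


def rmse(predicted, target):
    """Root-mean-square error between two equally long sequences."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    squared = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared) / len(squared))


def force_rmse(dirname):
    """RMSE of the force components for the training set found in dirname."""
    # Imported lazily so that rmse() can be used without calorine installed.
    from calorine.nep import get_parity_data, read_structures

    training_structures, _ = read_structures(dirname)
    df = get_parity_data(training_structures, "force", flatten=True)
    # ASSUMPTION: the columns are called "predicted" and "target".
    return rmse(list(df["predicted"]), list(df["target"]))
```

Calling force_rmse on the directory of a completed nep run would then give one number per data set, which can be compared between models.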

calorine.nep.read_loss(filename)[source]#

Parses a file in loss.out format from GPUMD and returns the content as a data frame. More information concerning file format, content and units can be found here.

Parameters:: filename (str) – input file name
Return type:: DataFrame

calorine.nep.read_nepfile(filename)[source]#

Returns the content of a configuration file (nep.in) as a dictionary.

Parameters:: filename (str) – input file name
Return type:: dict[str, Any]

calorine.nep.read_structures(dirname)[source]#

Parses the output files with training and test data from a nep run and returns their content as two lists of structures, representing training and test data, respectively. Target and predicted data are included in the info dict of the Atoms objects.

Parameters:: dirname (str) – Directory from which to read output files.
Return type:: tuple[list[Atoms], list[Atoms]]
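
As a hedged illustration, the sketch below lists which keys are present in the info dicts of the returned structures. The helper names collect_info_keys and summarize_run are hypothetical, and the demo objects only mimic the info attribute of ASE Atoms objects.

```python
from types import SimpleNamespace


def collect_info_keys(structures):
    """Return the sorted set of info keys found across all structures."""
    keys = set()
    for structure in structures:
        keys.update(structure.info.keys())
    return sorted(keys)


def summarize_run(dirname):
    """Print the size of the training and test sets and the available keys."""
    # Imported lazily; requires calorine and the output files of a nep run.
    from calorine.nep import read_structures

    training, test = read_structures(dirname)
    print(f"{len(training)} training structures, {len(test)} test structures")
    print("info keys:", collect_info_keys(training))


# Stand-in objects that mimic the info dict of ASE Atoms objects.
demo = [
    SimpleNamespace(info={"energy_target": -1.0, "energy_predicted": -1.1}),
    SimpleNamespace(info={"force_target": [0.0], "force_predicted": [0.1]}),
]
print(collect_info_keys(demo))
```

Inspecting the keys first is a cheap way to see which properties a given run actually produced before requesting parity data for them.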

calorine.nep.setup_training(parameters, structures, enforced_structures=[], rootdir='.', mode='kfold', n_splits=None, train_fraction=None, seed=42, overwrite=False)[source]#

Sets up the input files for training a NEP via the nep executable of the GPUMD package.

Parameters:

parameters (NamedTuple) – dictionary containing the parameters to be set in the nep.in file; see here for an overview of these parameters
structures (List[Atoms]) – list of structures to be included
enforced_structures (List[int]) – structures that must be included in the training set, provided in the form of a list of indices that refer to the content of the structures parameter
rootdir (str) – root directory in which to create the input files
mode (str) – how the test-train split is performed. Options: 'kfold' and 'bagging'
n_splits (int) – number of splits of the input structures in training and test sets that ought to be performed; by default no split will be done and all input structures will be used for training
train_fraction (float) – fraction of structures to use for training when mode 'bagging' is used
seed (int) – random number generator seed to be used; this ensures reproducibility
overwrite (bool) – if True overwrite the content of rootdir if it exists

Return type:

None
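
The sketch below illustrates the idea behind a k-fold split with enforced structures, followed by a possible call to setup_training. The function kfold_split is a stand-alone illustration and not the implementation used by calorine. The parameter values, the species Pb and Te, the directory name, and the assumption that a plain dict with list values is accepted for parameters are all illustrative.

```python
import random


def kfold_split(n_structures, n_splits, enforced=(), seed=42):
    """Sketch of a k-fold split; NOT calorine's actual implementation.

    Returns a list of (train_indices, test_indices) pairs. Indices listed
    in `enforced` are always placed in the training set.
    """
    enforced = set(enforced)
    free = [i for i in range(n_structures) if i not in enforced]
    random.Random(seed).shuffle(free)
    # Deal the shuffled indices into n_splits folds of nearly equal size.
    folds = [free[k::n_splits] for k in range(n_splits)]
    splits = []
    for test in folds:
        held_out = set(test)
        train = sorted(enforced | (set(free) - held_out))
        splits.append((train, sorted(test)))
    return splits


def prepare_training(structures, rootdir="nep_training"):
    """Write input files for five training runs with different splits."""
    # Imported lazily; requires calorine to be installed.
    from calorine.nep import setup_training

    # ASSUMPTION: illustrative nep.in keywords and values for a Pb-Te model.
    parameters = dict(
        version=4,
        type=[2, "Pb", "Te"],
        cutoff=[8, 4],
        neuron=30,
        generation=100000,
    )
    setup_training(
        parameters,
        structures,
        enforced_structures=[0],  # structure 0 is always used for training
        rootdir=rootdir,
        mode="kfold",
        n_splits=5,
        seed=42,
        overwrite=False,
    )
```

In this sketch every non-enforced structure ends up in exactly one test set, which is the defining property of a k-fold split. A bagging-style split would instead draw a random training fraction for each split.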

calorine.nep.write_nepfile(parameters, dirname)[source]#

Writes parameters file for NEP construction.

Parameters:

parameters (NamedTuple) – input parameters; see here
dirname (str) – directory in which to place input file and links

Return type:

None

calorine.nep.write_structures(outfile, structures)[source]#

Writes structures for training/testing in a format readable by the nep executable.

Parameters:

outfile (str) – output filename
structures (list[Atoms]) – list of structures with energy, forces, and (possibly) stresses

Return type:

None

Evaluating models#

TNEP models allow one to represent tensorial properties such as dipole moment, susceptibility, or polarizability. To test and analyze these models calorine provides several specialized functions, which can also be used to implement extended Hamiltonians.

calorine.nep.get_dipole(structure, model_filename=None, debug=False)[source]#

Calculates the dipole for a given structure. A NEP model defined by a nep.txt file needs to be provided.

Parameters:

structure (Atoms) – Input structure
model_filename (Optional[str]) – Path to NEP model in nep.txt format. Defaults to None.
debug (bool) – Flag to toggle debug mode. Prints GPUMD output. Defaults to False.

Returns:

dipole with shape (3,)

calorine.nep.get_dipole_gradient(structure, model_filename=None, backend='c++', method='central difference', displacement=0.01, charge=1.0, nep_command='nep', debug=False)[source]#

Calculates the dipole gradient for a given structure using finite differences. A NEP model defined by a nep.txt file needs to be provided.

Parameters:

structure (Atoms) – Input structure
model_filename (Optional[str]) – Path to NEP model in nep.txt format. Defaults to None.
backend (str) – Backend to use for computing dipole gradient with finite differences. One of 'c++' (CPU), 'python' (CPU) and 'nep' (GPU). Defaults to 'c++'.
method (str) – Method for computing gradient with finite differences. One of 'forward difference' and 'central difference'. Defaults to 'central difference'.
displacement (float) – Displacement in Å to use for finite differences. Defaults to 0.01.
charge (float) – System charge in units of the elementary charge. Used for correcting the dipoles before computing the gradient. Defaults to 1.0.
nep_command (str) – Command for running the NEP executable. Defaults to 'nep'.
debug (bool) – Flag to toggle debug mode. Prints GPUMD output (if applicable). Defaults to False.

Return type:

ndarray

Returns:

dipole gradient with shape (N, 3, 3)
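
To make the method and displacement parameters more concrete, the following sketch applies both finite-difference schemes to a scalar function of one variable. It only illustrates the numerical technique and is not the code used by calorine. The names finite_difference and square are hypothetical.

```python
def finite_difference(func, x, displacement=0.01, method="central difference"):
    """Approximate d(func)/dx at x for a scalar function of one variable."""
    if method == "forward difference":
        # One extra evaluation; error is first order in the displacement.
        return (func(x + displacement) - func(x)) / displacement
    if method == "central difference":
        # Two extra evaluations; error is second order in the displacement.
        forward = func(x + displacement)
        backward = func(x - displacement)
        return (forward - backward) / (2 * displacement)
    raise ValueError(f"unknown method: {method}")


def square(x):
    """Test function with the known derivative 2 * x."""
    return x * x


print(finite_difference(square, 1.0, method="forward difference"))
print(finite_difference(square, 1.0, method="central difference"))
```

For this quadratic test function the central difference reproduces the exact derivative of 2 up to rounding, whereas the forward difference is off by roughly the displacement. The price is one additional function evaluation per displaced coordinate.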

calorine.nep.get_polarizability(structure, model_filename=None, debug=False)[source]#

Calculates the polarizability tensor for a given structure. A NEP model defined by a nep.txt file needs to be provided. The model must be trained to predict the polarizability.

Parameters:

structure (Atoms) – Input structure
model_filename (Optional[str]) – Path to NEP model in nep.txt format. Defaults to None.
debug (bool) – Flag to toggle debug mode. Prints GPUMD output. Defaults to False.

Returns:

polarizability with shape (3, 3)

calorine.nep.get_polarizability_gradient(structure, model_filename=None, displacement=0.01, component='full', debug=False)[source]#

Calculates the polarizability gradient for a given structure using finite differences. A NEP model defined by a nep.txt file needs to be provided. This function computes the derivatives using the second-order central difference method with a C++ backend.

Parameters:

structure (Atoms) – Input structure.
model_filename (Optional[str]) – Path to NEP model in nep.txt format. Defaults to None.
displacement (float) – Displacement in Å to use for finite differences. Defaults to 0.01.
component (Union[str, List[str]]) – Component or components of the polarizability tensor that the gradient should be computed for. The following components are available: x, y, z, full. Option full computes the derivative whilst moving the atoms in each Cartesian direction, which yields a tensor of shape (N, 3, 6). Multiple components may be specified. Defaults to full.
debug (bool) – Flag to toggle debug mode. Prints GPUMD output (if applicable). Defaults to False.

Return type:

ndarray

Returns:

polarizability gradient with shape (N, C, 6) where C is the number of components chosen.

calorine.nep.get_potential_forces_and_virials(structure, model_filename=None, debug=False)[source]#

Calculates the per-atom potential, forces and virials for a given structure. A NEP model defined by a nep.txt file needs to be provided.

Parameters:

structure (Atoms) – Input structure
model_filename (Optional[str]) – Path to NEP model. Defaults to None.
debug (bool) – Flag to toggle debug mode. Prints GPUMD output. Defaults to False.

Return type:

Tuple[ndarray, ndarray, ndarray]

Returns:

potential with shape (natoms,)
forces with shape (natoms, 3)
virials with shape (natoms, 9)
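
A brief sketch of how the per-atom output might be post-processed is given below. The helper names total_energy, net_force, and evaluate are hypothetical. The helpers accept nested lists as well as arrays with the shapes listed above.

```python
def total_energy(per_atom_potential):
    """Sum the per-atom potential energies to obtain the total energy."""
    return float(sum(per_atom_potential))


def net_force(forces):
    """Sum the force vectors over all atoms.

    For a translationally invariant potential the result should vanish
    up to numerical noise, which makes this a simple sanity check.
    """
    return [float(sum(row[k] for row in forces)) for k in range(3)]


def evaluate(structure, model_filename):
    """Return the total energy and the net force for a structure."""
    # Imported lazily; requires calorine to be installed.
    from calorine.nep import get_potential_forces_and_virials

    potential, forces, virials = get_potential_forces_and_virials(
        structure, model_filename=model_filename
    )
    return total_energy(potential), net_force(forces)
```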

Inspecting NEP models#

Once a model has been trained it can be analyzed in more detail. To this end, there are functions for accessing the descriptors and the latent space, and for loading the entire model. The latter function (read_model) returns a Model object, which contains all information about the model. It is thereby possible not only to query but also to manipulate the model and write the result back to disk.

calorine.nep.get_descriptors(structure, model_filename, debug=False)[source]#

Calculates the NEP descriptors for a given structure. The NEP model defined by the supplied nep.txt file is used to obtain the model-specific descriptors.

Parameters:

structure (Atoms) – Input structure
model_filename (str) – Path to NEP model in nep.txt format.
debug (bool) – Flag to toggle debug mode. Prints GPUMD output. Defaults to False.

Returns:

Descriptors for the supplied structure, with shape (number_of_atoms, descriptor components)
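
One possible use of the per-atom descriptors is to average them over atoms in order to obtain a single fingerprint per structure, for example as input to a dimensionality reduction. The sketch below shows this averaging step. The helper names mean_descriptor and structure_fingerprint are hypothetical, and the averaging is an illustrative choice rather than something prescribed by calorine.

```python
def mean_descriptor(descriptors):
    """Average descriptor rows of shape (number_of_atoms, components)."""
    rows = [list(row) for row in descriptors]
    n_atoms = len(rows)
    # zip(*rows) iterates over descriptor components (columns).
    return [sum(column) / n_atoms for column in zip(*rows)]


def structure_fingerprint(structure, model_filename):
    """Return the atom-averaged descriptor vector of a structure."""
    # Imported lazily; requires calorine to be installed.
    from calorine.nep import get_descriptors

    descriptors = get_descriptors(structure, model_filename)
    return mean_descriptor(descriptors)
```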

calorine.nep.get_latent_space(structure, model_filename=None, debug=False)[source]#

Calculates the latent space representation of a structure, i.e., the activations in the hidden layer. A NEP model defined by a nep.txt file needs to be provided.

Parameters:

structure (Atoms) – Input structure
model_filename (Optional[str]) – Path to NEP model. Defaults to None.
debug (bool) – Flag to toggle debug mode. Prints GPUMD output. Defaults to False.

Returns:

Activation with shape (natoms, N_neurons)

calorine.nep.read_model(filename)[source]#

Parses a file in nep.txt format and returns the content in the form of a Model object.

Parameters:: filename (str) – Input file name.
Return type:: Model
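
The following sketch collects a few attributes of a loaded model into a dictionary for quick inspection. It relies only on attribute names documented for the Model class. The helper names summarize_model and print_model_summary and the selection of fields are hypothetical.

```python
FIELDS = (
    "version",
    "model_type",
    "types",
    "radial_cutoff",
    "angular_cutoff",
    "n_neuron",
    "n_parameters",
)


def summarize_model(model):
    """Collect selected attributes of a Model-like object in a dict."""
    return {name: getattr(model, name) for name in FIELDS}


def print_model_summary(filename):
    """Read a nep.txt file and print a short overview of the model."""
    # Imported lazily; requires calorine to be installed.
    from calorine.nep import read_model

    model = read_model(filename)
    for name, value in summarize_model(model).items():
        print(f"{name:15s} {value}")
```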

NEP model class#

class calorine.nep.model.Model(version, model_type, types, radial_cutoff, angular_cutoff, n_basis_radial, n_basis_angular, n_max_radial, n_max_angular, l_max_3b, l_max_4b, l_max_5b, n_descriptor_radial, n_descriptor_angular, n_neuron, n_parameters, n_descriptor_parameters, n_ann_parameters, ann_parameters, q_scaler, radial_descriptor_weights, angular_descriptor_weights, sqrt_epsilon_infinity=None, restart_parameters=None, zbl=None, zbl_typewise_cutoff_factor=None, max_neighbors_radial=None, max_neighbors_angular=None, radial_typewise_cutoff_factor=None, angular_typewise_cutoff_factor=None)[source]#

Objects of this class represent a NEP model in a form suitable for inspection and manipulation. Typically a Model object is instantiated by calling the read_model function.

version#

NEP version.

Type:: int

model_type#

One of potential, dipole or polarizability.

Type:: str

types#

Chemical species that this model represents.

Type:: tuple[str, …]

radial_cutoff#

The radial cutoff parameter in Å.

Type:: float

angular_cutoff#

The angular cutoff parameter in Å.

Type:: float

max_neighbors_radial#

Maximum number of neighbors in neighbor list for radial terms.

Type:: int

max_neighbors_angular#

Maximum number of neighbors in neighbor list for angular terms.

Type:: int

radial_typewise_cutoff_factor#

The radial cutoff factor if use_typewise_cutoff is used.

Type:: float

angular_typewise_cutoff_factor#

The angular cutoff factor if use_typewise_cutoff is used.

Type:: float

zbl#

Inner and outer cutoff for transition to ZBL potential.

Type:: tuple[float, float]

zbl_typewise_cutoff_factor#

Typewise cutoff when use_typewise_cutoff_zbl is used.

Type:: float

n_basis_radial#

Number of radial basis functions $n_\mathrm{basis}^\mathrm{R}$.

Type:: int

n_basis_angular#

Number of angular basis functions $n_\mathrm{basis}^\mathrm{A}$.

Type:: int

n_max_radial#

Maximum order of Chebyshev polynomials included in radial expansion $n_\mathrm{max}^\mathrm{R}$.

Type:: int

n_max_angular#

Maximum order of Chebyshev polynomials included in angular expansion $n_\mathrm{max}^\mathrm{A}$.

Type:: int

l_max_3b#

Maximum expansion order for three-body terms $l_\mathrm{max}^\mathrm{3b}$.

Type:: int

l_max_4b#

Maximum expansion order for four-body terms $l_\mathrm{max}^\mathrm{4b}$.

Type:: int

l_max_5b#

Maximum expansion order for five-body terms $l_\mathrm{max}^\mathrm{5b}$.

Type:: int

n_descriptor_radial#

Dimension of radial part of descriptor.

Type:: int

n_descriptor_angular#

Dimension of angular part of descriptor.

Type:: int

n_neuron#

Number of neurons in hidden layer.

Type:: int

n_parameters#

Total number of parameters including scalers (which are not fit parameters).

Type:: int

n_descriptor_parameters#

Number of parameters in descriptor.

Type:: int

n_ann_parameters#

Number of neural network weights.

Type:: int

ann_parameters#

Neural network weights.

Type:: dict[tuple[str, dict[str, np.ndarray]]]

q_scaler#

Scaling parameters.

Type:: List[float]

radial_descriptor_weights#

Radial descriptor weights by combination of species; the array for each combination has dimensions of $(n_\mathrm{max}^\mathrm{R}+1) \times (n_\mathrm{basis}^\mathrm{R}+1)$.

Type:: dict[tuple[str, str], np.ndarray]

angular_descriptor_weights#

Angular descriptor weights by combination of species; the array for each combination has dimensions of $(n_\mathrm{max}^\mathrm{A}+1) \times (n_\mathrm{basis}^\mathrm{A}+1)$.

Type:: dict[tuple[str, str], np.ndarray]
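
The array dimensions quoted for the radial and angular descriptor weights can be turned into a simple count, as sketched below. The sketch assumes that there is one weight array per ordered pair of species, so that a model with N species has N squared arrays of each kind. This assumption, the function name count_descriptor_weights, and the example values are not taken from the text above.

```python
def count_descriptor_weights(
    n_types, n_max_radial, n_basis_radial, n_max_angular, n_basis_angular
):
    """Count descriptor weights from the array dimensions given above."""
    radial = (n_max_radial + 1) * (n_basis_radial + 1)
    angular = (n_max_angular + 1) * (n_basis_angular + 1)
    # ASSUMPTION: one array per ordered pair of species, i.e., n_types ** 2.
    n_pairs = n_types**2
    return n_pairs * (radial + angular)


# Two species with n_max = 4 and n_basis = 8 for both parts:
# 4 pairs * (5 * 9 + 5 * 9) weights.
print(count_descriptor_weights(2, 4, 8, 4, 8))
```

For a loaded model, the same number could be obtained by summing the sizes of the arrays stored in radial_descriptor_weights and angular_descriptor_weights, which avoids relying on the pair-count assumption.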

sqrt_epsilon_infinity#

Square root of epsilon infinity $\epsilon_\infty$ (only for NEP models with charges).

Type:: Optional[float]

restart_parameters#

NEP restart parameters. A nested dictionary that contains the mean (mu) and standard deviation (sigma) for the ANN and descriptor parameters. It is set using the read_restart method. Defaults to None.

Type:: dict[str, dict[str, dict[str, np.ndarray]]]

read_restart(filename)[source]#

Parses a file in nep.restart format and saves the content in the form of mean and standard deviation for each parameter in the corresponding NEP model.

Parameters:: filename (str) – Input file name.

remove_species(species)[source]#

Removes one or more species from the model.

This method modifies the model in-place by removing all parameters associated with the specified chemical species. It prunes the species list, the Artificial Neural Network (ANN) parameters, and the descriptor weights. It also recalculates the total number of parameters in the model.

Parameters:: species (list[str]) – A list of species names (str) to remove from the model.
Raises:: ValueError – If any of the provided species is not found in the model.
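
A possible workflow for pruning a model is sketched below. Since remove_species raises a ValueError for unknown species, the sketch first filters the requested species against the types attribute of the model. The helper names removable_species and prune_model are hypothetical.

```python
def removable_species(model, unwanted):
    """Return the entries of `unwanted` that are present in model.types."""
    return [species for species in unwanted if species in model.types]


def prune_model(infile, outfile, unwanted):
    """Remove species from a model and write the result to a new file."""
    # Imported lazily; requires calorine to be installed.
    from calorine.nep import read_model

    model = read_model(infile)
    present = removable_species(model, unwanted)
    if present:
        model.remove_species(present)  # modifies the model in place
    model.write(outfile)
    return model.types
```

Filtering beforehand means that species missing from the model are skipped rather than causing an error. If a typo in a species name should be reported instead, the list can be passed to remove_species directly.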

write(filename)[source]#

Write NEP model to file in nep.txt format.

Return type:: None

write_restart(filename)[source]#

Write NEP restart parameters to file in nep.restart format.