NEP interface#
Training IO#
calorine provides a number of functions for preparing input files for training NEP models, including in particular the setup_training function.
There are also several functions for analyzing the training process, including, e.g., the read_loss, read_structures, and get_parity_data functions.
- calorine.nep.get_parity_data(structures, property, selection=None, flatten=True)[source]#
Returns the predicted and target energies, forces, virials or stresses from a list of structures in a format suitable for generating parity plots.
The structures should have been read using read_structures, such that the info object is populated with keys of the form <property>_<type>, where <property> is one of energy, force, virial, or stress, and <type> is either predicted or target.
The resulting parity data is returned as a tuple of dicts, where each entry corresponds to a list.
- Parameters:
structures (List[Atoms]) – List of structures as read with read_structures.
property (str) – One of energy, force, virial, stress, polarizability, dipole.
selection (List[str]) – A list containing which components to return, and/or the absolute value. Possible values are x, y, z, xx, yy, zz, yz, xz, xy, abs, pressure.
flatten (bool) – if True return flattened lists; this is useful for flattening the components of force or virials into a simple list
- Return type:
DataFrame
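As a rough illustration of how parity data might be consumed, the sketch below collects predicted and target energies from dictionaries that follow the <property>_<type> key convention described above and computes a root-mean-square error. The plain dictionaries merely stand in for the info property of ASE Atoms objects, the numbers are made up, and the helpers collect_parity and rmse are hypothetical names that are not part of calorine.

```python
import math


def collect_parity(infos, prop):
    """Gather predicted and target values for one property.

    `infos` is a list of dicts standing in for `atoms.info`; the keys
    follow the `<property>_<type>` pattern, e.g. `energy_predicted`.
    """
    predicted = [info[f"{prop}_predicted"] for info in infos]
    target = [info[f"{prop}_target"] for info in infos]
    return predicted, target


def rmse(predicted, target):
    """Root-mean-square error between two equally long sequences."""
    if len(predicted) != len(target):
        raise ValueError("predicted and target must have the same length")
    squared = [(p - t) ** 2 for p, t in zip(predicted, target)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up numbers purely for illustration.
infos = [
    {"energy_predicted": -3.10, "energy_target": -3.00},
    {"energy_predicted": -2.90, "energy_target": -3.00},
]
predicted, target = collect_parity(infos, "energy")
error = rmse(predicted, target)
```

Plotting `predicted` against `target` gives the parity plot itself; the scalar error is a convenient single-number summary of the same data.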
- calorine.nep.read_loss(filename)[source]#
Parses a file in loss.out format from GPUMD and returns the content as a data frame. More information concerning file format, content and units can be found in the GPUMD documentation.
- Parameters:
filename (str) – input file name
- Return type:
DataFrame
- calorine.nep.read_nepfile(filename)[source]#
Returns the content of a configuration file (nep.in) as a dictionary.
- Parameters:
filename (str) – input file name
- Return type:
Dict[str, Any]
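To make the idea of reading a configuration file into a dictionary concrete, here is a minimal sketch of a keyword/value parser. It assumes that each line holds a keyword followed by whitespace-separated values and that `#` starts a comment. The function parse_nep_in is a hypothetical helper, not calorine's read_nepfile, and it leaves all values as strings rather than guessing at type conversion.

```python
def parse_nep_in(text):
    """Parse keyword/value lines into a dict (illustrative only)."""
    settings = {}
    for raw_line in text.splitlines():
        # Drop anything after a comment marker and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        keyword, *values = line.split()
        settings[keyword] = values
    return settings


# Hypothetical file content used only to exercise the parser.
example = """
# example settings
type 2 Pb Te
cutoff 8 4
n_max 4 4
"""
settings = parse_nep_in(example)
```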
- calorine.nep.read_structures(dirname)[source]#
Parses the energy_*.out, force_*.out, virial_*.out, polarizability_*.out and dipole_*.out files from a nep run and returns their content as lists. The first and second list contain the structures from the training and test sets, respectively. Each list entry corresponds to an ASE Atoms object, which in turn contains predicted and target energies, forces and virials/stresses or polarizability/dipole stored in the info property.
- calorine.nep.setup_training(parameters, structures, enforced_structures=[], rootdir='.', mode='kfold', n_splits=None, train_fraction=None, seed=42, overwrite=False)[source]#
Sets up the input files for training a NEP via the nep executable of the GPUMD package.
- Parameters:
parameters (NamedTuple) – dictionary containing the parameters to be set in the nep.in file; see the GPUMD documentation for an overview of these parameters
structures (List[Atoms]) – list of structures to be included
enforced_structures (List[int]) – structures that must be included in the training set, provided in the form of a list of indices that refer to the content of the structures parameter
rootdir (str) – root directory in which to create the input files
mode (str) – how the test-train split is performed. Options: 'kfold' and 'bagging'
n_splits (int) – number of splits of the input structures into training and test sets to perform; by default no split will be done and all input structures will be used for training
train_fraction (float) – fraction of structures to use for training when mode 'bagging' is used
seed (int) – random number generator seed to be used; this ensures reproducibility
overwrite (bool) – if True overwrite the content of rootdir if it exists
- Return type:
None
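The two split modes can be pictured with the following sketch, which works on structure indices only. It is not calorine's implementation: the helpers kfold_splits and bagging_splits are hypothetical, enforced indices are assumed to be excluded from the test pool, and the bagging variant is assumed to draw its training subset without replacement.

```python
import random


def kfold_splits(n_structures, n_splits, enforced=(), seed=42):
    """Return (train, test) index lists; every free index is tested once."""
    rng = random.Random(seed)
    enforced = set(enforced)
    free = [i for i in range(n_structures) if i not in enforced]
    rng.shuffle(free)
    folds = [free[k::n_splits] for k in range(n_splits)]
    splits = []
    for k, test_fold in enumerate(folds):
        train = set(enforced)  # enforced structures are always trained on
        for j, fold in enumerate(folds):
            if j != k:
                train.update(fold)
        splits.append((sorted(train), sorted(test_fold)))
    return splits


def bagging_splits(n_structures, n_splits, train_fraction, enforced=(), seed=42):
    """Return (train, test) index lists from repeated random subsampling."""
    rng = random.Random(seed)
    enforced = set(enforced)
    free = [i for i in range(n_structures) if i not in enforced]
    n_train = round(train_fraction * len(free))
    splits = []
    for _ in range(n_splits):
        chosen = set(rng.sample(free, n_train))
        train = sorted(enforced | chosen)
        test = sorted(set(free) - chosen)
        splits.append((train, test))
    return splits


folds = kfold_splits(10, 5, enforced=[0])
bags = bagging_splits(10, 3, 0.5, enforced=[0])
```

Fixing the seed makes both helpers return the same splits on every call, which is the purpose the seed parameter serves in the description above.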
Evaluating models#
TNEP models allow one to represent tensorial properties such as dipole moment, susceptibility, or polarizability. To test and analyze these models calorine provides several specialized functions, which can also be used to implement extended Hamiltonians.
- calorine.nep.get_dipole(structure, model_filename=None, debug=False)[source]#
Calculates the dipole for a given structure. A NEP model defined by a nep.txt file needs to be provided.
- Parameters:
structure (Atoms) – Input structure
model_filename (Optional[str]) – Path to NEP model in nep.txt format. Defaults to None.
debug (bool) – Flag to toggle debug mode. Prints GPUMD output. Defaults to False.
- Returns:
dipole with shape (3,)
- calorine.nep.get_dipole_gradient(structure, model_filename=None, backend='c++', method='central difference', displacement=0.01, charge=1.0, nep_command='nep', debug=False)[source]#
Calculates the dipole gradient for a given structure using finite differences. A NEP model defined by a nep.txt file needs to be provided.
- Parameters:
structure (Atoms) – Input structure
model_filename (Optional[str]) – Path to NEP model in nep.txt format. Defaults to None.
backend (str) – Backend to use for computing dipole gradient with finite differences. One of 'c++' (CPU), 'python' (CPU) and 'nep' (GPU). Defaults to 'c++'.
method (str) – Method for computing gradient with finite differences. One of 'forward difference' and 'central difference'. Defaults to 'central difference'.
displacement (float) – Displacement in Å to use for finite differences. Defaults to 0.01.
charge (float) – System charge in units of the elemental charge. Used for correcting the dipoles before computing the gradient. Defaults to 1.0.
nep_command (str) – Command for running the NEP executable. Defaults to 'nep'.
debug (bool) – Flag to toggle debug mode. Prints GPUMD output (if applicable). Defaults to False.
- Returns:
dipole gradient with shape (N, 3, 3)
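The difference between the two finite-difference methods can be seen in a small self-contained sketch. The function toy_dipole is a hypothetical nonlinear stand-in for a trained model, and the index convention used here (atom, displacement direction, dipole component) is an assumption made for the illustration rather than something stated above.

```python
def toy_dipole(positions, charges, alpha=0.1):
    """Hypothetical nonlinear dipole model standing in for a NEP model."""
    mu = [0.0, 0.0, 0.0]
    for q, r in zip(charges, positions):
        r2 = sum(x * x for x in r)
        for a in range(3):
            mu[a] += q * r[a] * (1.0 + alpha * r2)
    return mu


def dipole_gradient(positions, charges, displacement=0.01,
                    method="central difference"):
    """Finite-difference gradient with result[i][b][a] = d mu_a / d r_ib."""
    reference = toy_dipole(positions, charges)
    gradient = []
    for i in range(len(positions)):
        per_atom = []
        for b in range(3):
            plus = [list(r) for r in positions]
            plus[i][b] += displacement
            mu_plus = toy_dipole(plus, charges)
            if method == "central difference":
                minus = [list(r) for r in positions]
                minus[i][b] -= displacement
                mu_minus = toy_dipole(minus, charges)
                row = [(p - m) / (2 * displacement)
                       for p, m in zip(mu_plus, mu_minus)]
            elif method == "forward difference":
                row = [(p - m) / displacement
                       for p, m in zip(mu_plus, reference)]
            else:
                raise ValueError(f"unknown method: {method}")
            per_atom.append(row)
        gradient.append(per_atom)
    return gradient


positions = [[0.0, 0.0, 0.0], [1.0, 0.5, 0.0]]
charges = [1.0, -1.0]
central = dipole_gradient(positions, charges, method="central difference")
forward = dipole_gradient(positions, charges, method="forward difference")
```

Central differences need two model evaluations per displaced coordinate instead of one, but their error shrinks quadratically with the displacement, whereas the forward-difference error shrinks only linearly.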
- calorine.nep.get_polarizability(structure, model_filename=None, debug=False)[source]#
Calculates the polarizability tensor for a given structure. A NEP model defined by a nep.txt file needs to be provided. The model must be trained to predict the polarizability.
- Parameters:
structure (Atoms) – Input structure
model_filename (Optional[str]) – Path to NEP model in nep.txt format. Defaults to None.
debug (bool) – Flag to toggle debug mode. Prints GPUMD output. Defaults to False.
- Returns:
polarizability with shape (3, 3)
- calorine.nep.get_polarizability_gradient(structure, model_filename=None, displacement=0.01, component='full', debug=False)[source]#
Calculates the polarizability gradient for a given structure using finite differences. A NEP model defined by a nep.txt file needs to be provided. This function computes the derivatives using the second-order central difference method with a C++ backend.
- Parameters:
structure (Atoms) – Input structure.
model_filename (Optional[str]) – Path to NEP model in nep.txt format. Defaults to None.
displacement (float) – Displacement in Å to use for finite differences. Defaults to 0.01.
component (Union[str, List[str]]) – Component or components of the polarizability tensor that the gradient should be computed for. The following components are available: x, y, z, full. Option full computes the derivative whilst moving the atoms in each Cartesian direction, which yields a tensor of shape (N, 3, 6). Multiple components may be specified. Defaults to full.
debug (bool) – Flag to toggle debug mode. Prints GPUMD output (if applicable). Defaults to False.
- Returns:
polarizability gradient with shape (N, C, 6) where C is the number of components chosen.
- calorine.nep.get_potential_forces_and_virials(structure, model_filename=None, debug=False)[source]#
Calculates the per-atom potential, forces and virials for a given structure. A NEP model defined by a nep.txt file needs to be provided.
- Parameters:
structure (Atoms) – Input structure
model_filename (Optional[str]) – Path to NEP model. Defaults to None.
debug (bool) – Flag to toggle debug mode. Prints GPUMD output. Defaults to False.
- Returns:
potential with shape (natoms,)
forces with shape (natoms, 3)
virials with shape (natoms, 9)
Inspecting NEP models#
Once a model has been trained it can be analyzed in more detail.
To this end, there are functions for accessing the descriptors and the latent space, and for loading the entire model.
The latter function (read_model) returns a Model object, which contains all information about the model.
It is thereby possible not only to query but also to manipulate the model and write the result back to disk.
- calorine.nep.get_descriptors(structure, model_filename=None, debug=False)[source]#
Calculates the NEP descriptors for a given structure. A NEP model defined by a nep.txt file can additionally be provided to get the NEP3 model-specific descriptors.
- Parameters:
structure (Atoms) – Input structure
model_filename (Optional[str]) – Path to NEP model in nep.txt format. Defaults to None.
debug (bool) – Flag to toggle debug mode. Makes the generated dummy NEP2 model available in a local tmp directory and prints GPUMD output. Defaults to False.
- Returns:
Descriptors for the supplied structure, with shape (number_of_atoms, descriptor components)
- calorine.nep.get_latent_space(structure, model_filename=None, debug=False)[source]#
Calculates the latent space representation of a structure, i.e., the activations in the hidden layer. A NEP model defined by a nep.txt file needs to be provided.
- Parameters:
structure (Atoms) – Input structure
model_filename (Optional[str]) – Path to NEP model. Defaults to None.
debug (bool) – Flag to toggle debug mode. Prints GPUMD output. Defaults to False.
- Returns:
Activation with shape (natoms, N_neurons)
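What "activations in the hidden layer" means can be sketched for a network with a single hidden layer. The tanh activation and the sign convention for the bias are assumptions taken from the commonly published NEP formulation, the descriptors are assumed to be already scaled, and the weights below are made-up values. The helper hidden_activations is hypothetical and not part of calorine.

```python
import math


def hidden_activations(descriptors, weights, biases):
    """Return tanh(W q - b) for every atom.

    descriptors: per-atom descriptor vectors, shape (natoms, n_desc)
    weights:     hidden-layer weights, shape (n_neurons, n_desc)
    biases:      hidden-layer biases, shape (n_neurons,)
    """
    activations = []
    for q in descriptors:
        row = []
        for w_row, b in zip(weights, biases):
            pre_activation = sum(w * x for w, x in zip(w_row, q)) - b
            row.append(math.tanh(pre_activation))
        activations.append(row)
    return activations


# Two atoms, two descriptor components, three hidden neurons (made up).
descriptors = [[0.0, 0.0], [1.0, 2.0]]
weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
biases = [0.0, 0.0, 0.0]
latent = hidden_activations(descriptors, weights, biases)
```

The result has one row per atom and one column per hidden neuron, matching the (natoms, N_neurons) shape quoted above.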
NEP model class#
- class calorine.nep.model.Model(version, model_type, types, radial_cutoff, angular_cutoff, n_basis_radial, n_basis_angular, n_max_radial, n_max_angular, l_max_3b, l_max_4b, l_max_5b, n_descriptor_radial, n_descriptor_angular, n_neuron, n_parameters, n_descriptor_parameters, n_ann_parameters, ann_parameters, q_scaler, radial_descriptor_weights, angular_descriptor_weights, zbl=None, zbl_typewise_cutoff_factor=None, max_neighbors_radial=None, max_neighbors_angular=None, radial_typewise_cutoff_factor=None, angular_typewise_cutoff_factor=None)[source]#
Objects of this class represent a NEP model in a form suitable for inspection and manipulation. Typically a Model object is instantiated by calling the read_model function.
- version#
NEP version.
- Type:
int
- model_type#
One of potential, dipole or polarizability.
- Type:
str
- types#
Chemical species that this model represents.
- Type:
Tuple[str, …]
- radial_cutoff#
The radial cutoff parameter in Å.
- Type:
float
- angular_cutoff#
The angular cutoff parameter in Å.
- Type:
float
- max_neighbors_radial#
Maximum number of neighbors in neighbor list for radial terms.
- Type:
int
- max_neighbors_angular#
Maximum number of neighbors in neighbor list for angular terms.
- Type:
int
- radial_typewise_cutoff_factor#
The radial cutoff factor if use_typewise_cutoff is used.
- Type:
float
- angular_typewise_cutoff_factor#
The angular cutoff factor if use_typewise_cutoff is used.
- Type:
float
- zbl#
Inner and outer cutoff for transition to ZBL potential.
- Type:
Tuple[float, float]
- zbl_typewise_cutoff_factor#
Typewise cutoff when use_typewise_cutoff_zbl is used.
- Type:
float
- n_basis_radial#
Number of radial basis functions \(n_\mathrm{basis}^\mathrm{R}\).
- Type:
int
- n_basis_angular#
Number of angular basis functions \(n_\mathrm{basis}^\mathrm{A}\).
- Type:
int
- n_max_radial#
Maximum order of Chebyshev polynomials included in radial expansion \(n_\mathrm{max}^\mathrm{R}\).
- Type:
int
- n_max_angular#
Maximum order of Chebyshev polynomials included in angular expansion \(n_\mathrm{max}^\mathrm{A}\).
- Type:
int
- l_max_3b#
Maximum expansion order for three-body terms \(l_\mathrm{max}^\mathrm{3b}\).
- Type:
int
- l_max_4b#
Maximum expansion order for four-body terms \(l_\mathrm{max}^\mathrm{4b}\).
- Type:
int
- l_max_5b#
Maximum expansion order for five-body terms \(l_\mathrm{max}^\mathrm{5b}\).
- Type:
int
- n_descriptor_radial#
Dimension of radial part of descriptor.
- Type:
int
- n_descriptor_angular#
Dimension of angular part of descriptor.
- Type:
int
- n_neuron#
Number of neurons in hidden layer.
- Type:
int
- n_parameters#
Total number of parameters including scalers (which are not fit parameters).
- Type:
int
- n_descriptor_parameters#
Number of parameters in descriptor.
- Type:
int
- n_ann_parameters#
Number of neural network weights.
- Type:
int
- ann_parameters#
Neural network weights.
- Type:
Dict[Tuple[str, Dict[str, np.ndarray]]]
- q_scaler#
Scaling parameters.
- Type:
List[float]
- radial_descriptor_weights#
Radial descriptor weights by combination of species; the array for each combination has dimensions of \((n_\mathrm{max}^\mathrm{R}+1) \times (n_\mathrm{basis}^\mathrm{R}+1)\).
- Type:
Dict[Tuple[str, str], np.ndarray]
- angular_descriptor_weights#
Angular descriptor weights by combination of species; the array for each combination has dimensions of \((n_\mathrm{max}^\mathrm{A}+1) \times (n_\mathrm{basis}^\mathrm{A}+1)\).
- Type:
Dict[Tuple[str, str], np.ndarray]
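The array dimensions quoted for the descriptor weights allow a quick count of how many entries these dictionaries hold. The sketch below assumes that there is one array per ordered pair of species, which is an assumption based on the Tuple[str, str] keys. Whether the total equals n_descriptor_parameters is not stated here and should be checked against an actual model. The helper descriptor_weight_count is hypothetical and the input numbers are made up.

```python
def descriptor_weight_count(n_types, n_max, n_basis):
    """Entries in the descriptor weights of one channel (radial or angular)."""
    per_pair = (n_max + 1) * (n_basis + 1)  # array size per species pair
    n_pairs = n_types * n_types  # ordered pairs of species (assumption)
    return n_pairs * per_pair


# Made-up settings for a two-species model.
n_types = 2
radial = descriptor_weight_count(n_types, n_max=4, n_basis=8)
angular = descriptor_weight_count(n_types, n_max=4, n_basis=8)
total = radial + angular
```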