Dielectric functions#

This tutorial demonstrates how to compute the dielectric function tensor \(\epsilon^{\alpha\beta}(\omega) = \epsilon_1^{\alpha\beta}(\omega) + i \epsilon_2^{\alpha\beta}(\omega)\) from a GPUMD molecular-dynamics simulation. GPUMD writes the time derivative of the polarization \(\dot{\mathbf{P}}(t)\) to a file named dpdt.out when using a qNEP model. The imaginary part of the dielectric tensor \(\epsilon_2^{\alpha\beta}(\omega)\) is related to the power spectral density of \(\dot{P}_\alpha(t)\) via

\[\epsilon_2^{\alpha\beta}(\omega) = \frac{1}{\epsilon_0 V k_\mathrm{B} T}\, \int_0^\infty \langle \dot{P}_\alpha(0)\,\dot{P}_\beta(\tau) \rangle\, e^{-i\omega\tau}\, d\tau,\]

where \(V\) is the simulation cell volume, \(T\) is the temperature, and \(\epsilon_0\) is the permittivity of free space. The isotropic average \(\bar{\epsilon}_2(\omega) = \frac{1}{3}\sum_\alpha \epsilon_2^{\alpha\alpha}(\omega)\) reduces to the scalar dielectric function for cubic or isotropic systems. The real part \(\epsilon_1^{\alpha\beta}(\omega)\) is obtained component-wise via the Kramers-Kronig relation

\[\epsilon_1^{\alpha\beta}(\omega) = \frac{2}{\pi}\, \mathcal{P}\int_0^\infty \frac{\omega'\,\epsilon_2^{\alpha\beta}(\omega')}{\omega'^2 - \omega^2}\, d\omega'.\]

The example data are from qNEP simulations of BaTiO\(_3\) at 100, 331, and 450 K from Z. Fan et al., J. Chem. Theory Comput. 22, 4787 (2026).

[1]:

base = 'https://zenodo.org/records/20648097/files'
for T in [100, 331, 450]:
    fname = f'dpdt-BaTiO3-{T}K.txt'
    import os, urllib.request
    if not os.path.exists(fname):
        urllib.request.urlretrieve(f'{base}/{fname}', fname)
        print(f'Downloaded {fname}')
    else:
        print(f'{fname} already present')

dpdt-BaTiO3-100K.txt already present
dpdt-BaTiO3-331K.txt already present
dpdt-BaTiO3-450K.txt already present

Reading the data#

The read_dpdt function reads a dpdt.out file into a pandas.DataFrame. The columns dPx, dPy, dPz are the Cartesian components of \(\dot{\mathbf{P}}(t)\) in units of \(e\)Å/fs, and Px, Py, Pz are the corresponding polarization components in units of \(e\)Å. The time step is read directly from the time column as the difference between consecutive rows.

[2]:

from calorine.gpumd import read_dpdt

temperatures = [100, 331, 450]  # K
dfs_raw = {T: read_dpdt(f'dpdt-BaTiO3-{T}K.txt') for T in temperatures}

dfs_raw[450].head()

[2]:

	time	dPx	dPy	dPz	Px	Py	Pz
0	0.02	-3.06855	-1.117540	1.867190	-61.3709	-22.3507	37.3438
1	0.04	-5.94163	0.009431	1.536860	-180.2030	-22.1621	68.0809
2	0.06	-5.86222	-2.869390	0.291382	-297.4480	-79.5499	73.9085
3	0.08	-4.57303	-4.939870	0.963120	-388.9080	-178.3470	93.1709
4	0.10	-1.08819	-5.468230	-0.305113	-410.6720	-287.7120	87.0687

Computing the dielectric function#

get_dielectric_function computes both parts of the dielectric tensor from all three Cartesian components of \(\dot{\mathbf{P}}(t)\), returning the six unique Voigt components of the imaginary part \(\epsilon_2^{\alpha\beta}(\omega)\) together with the real part \(\epsilon_1^{\alpha\beta}(\omega)\) obtained via the Kramers-Kronig relation. The required physical parameters are

volume: the simulation cell volume in Å\(^3\),
temperature: the temperature in K.

The keyword column='dP' (the default) selects the dPx/dPy/dPz columns from read_dpdt. The time step is extracted automatically from the time column. The optional parameter t_sigma sets the width of a Gaussian window applied to the autocorrelation function before the Fourier transform (in ps); smaller values produce smoother spectra while larger values yield sharper features.

[3]:

%%time
import numpy as np
from calorine.tools import get_dielectric_function

t_sigma = 2  # Gaussian window width in ps; smaller values give smoother spectra

lattice = np.array([[81.047018,  0.023688, -0.017573],
                    [ 0.014659, 80.976750, -0.016819],
                    [-0.017772, -0.026703, 80.939036]])
volume = np.abs(np.linalg.det(lattice))  # Angstrom^3
print(f'Cell volume: {volume:.1f} Angstrom^3')

dfs = {}
for T in temperatures:
    dfs[T] = get_dielectric_function(dfs_raw[T], volume, T, column='dP', t_sigma=t_sigma)
dfs[450].head()

Cell volume: 531196.7 Angstrom^3
CPU times: user 1min 47s, sys: 14.8 s, total: 2min 2s
Wall time: 1min

[3]:

	angular_frequency	wavenumber_invcm	epsilon_imag_xx	epsilon_imag_yy	epsilon_imag_zz	epsilon_imag_yz	epsilon_imag_xz	epsilon_imag_xy	conductivity_xx	conductivity_yy	conductivity_zz	conductivity_yz	conductivity_xz	conductivity_xy	epsilon_real_xx	epsilon_real_yy	epsilon_real_zz	epsilon_real_yz	epsilon_real_xz	epsilon_real_xy
0	0.006283	0.033357	124255.284323	59899.173855	41507.546044	-1965.406627	-6210.587144	7150.681630	6912.770709	3332.407606	2309.214856	-109.342677	-345.517417	397.818272	62862.620438	31549.732382	22417.644882	-946.858042	-2967.715355	3434.593445
1	0.012567	0.066714	62149.270547	29962.758741	20763.426580	-982.586187	-3105.680122	3575.548379	6915.177239	3333.873197	2310.288981	-109.329644	-345.560427	397.841367	31555.872535	16458.496943	11960.202017	-451.554422	-1402.762191	1632.702157
2	0.018850	0.100071	41456.876064	19989.807161	13853.010279	-654.927121	-2070.882673	2383.929419	6919.187705	3336.315735	2312.079141	-109.307891	-345.632071	397.879838	15076.692173	8514.590987	6455.453043	-190.875112	-579.063694	684.318563
3	0.025133	0.133428	31117.883584	15007.720701	10401.019451	-491.058181	-1553.612344	1788.188890	6924.801479	3339.735051	2314.585267	-109.277368	-345.732287	397.933652	9885.493452	6012.157721	4721.399918	-108.756631	-319.587462	385.573541
4	0.031417	0.166785	24920.248748	12021.979431	8332.398456	-392.705044	-1243.352559	1430.799575	6932.017682	3344.130904	2317.807258	-109.238007	-345.860991	398.002767	7561.505183	4891.890736	3945.128978	-71.994153	-203.428301	251.844760

Kramers-Kronig integration method#

The Kramers-Kronig step is performed internally by get_dielectric_function (controlled by return_real_part, which defaults to True). Two integration methods are available via the kk_method keyword:

'vectorized' (default): exact trapezoid rule, \(\mathcal{O}(n^2)\) time and memory.
'fft': \(\mathcal{O}(n \log n)\) Hilbert-transform approximation; faster for large arrays but can deviate near sharp features.

The section below compares both methods on the 100 K data, which has the sharpest phonon peaks and provides the most stringent test.

Visualization#

[4]:

import matplotlib.pyplot as plt

def isotropic_average(df, prefix):
    return (df[f'{prefix}_xx'] + df[f'{prefix}_yy'] + df[f'{prefix}_zz']) / 3


fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(6, 5), sharex=True)

for T in temperatures:
    df = dfs[T]
    ax1.plot(df.wavenumber_invcm, isotropic_average(df, 'epsilon_real'), label=f'{T} K')
    ax2.plot(df.wavenumber_invcm, isotropic_average(df, 'epsilon_imag'), label=f'{T} K')

ax1.axhline(0, color='k', linewidth=0.5, linestyle='--')
ax1.set_ylabel(r'$\widebar{\epsilon}_1(\omega)$')
ax1.legend()
ax1.set_xlim(0, 830)
ax1.set_ylim(-400, 1400)

ax2.set_xlabel(r'Wavenumber (cm$^{-1}$)')
ax2.set_ylabel(r'$\widebar{\epsilon}_2(\omega)$')
ax2.legend()
ax2.set_yscale('log')
ax2.set_ylim(0.1, 3e3)

fig.tight_layout()
fig.subplots_adjust(hspace=0)

../_images/get_started_dielectric_function_8_0.png

Accuracy: FFT vs. vectorized#

dfs[100] was computed with the default kk_method='vectorized'. The cell below recomputes the 100 K spectrum with kk_method='fft' for comparison. The two methods agree well across most of the spectrum, but the FFT approximation can deviate near sharp features and at larger frequencies.

[5]:

df_vec = dfs[100].iloc[5:]
df_fft = get_dielectric_function(dfs_raw[100], volume, 100,
                                 column='dP', t_sigma=t_sigma, kk_method='fft').iloc[5:]

fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(6, 5), sharex=True)

ax1.plot(df_vec.wavenumber_invcm, isotropic_average(df_vec, 'epsilon_real'), label='Vectorized')
ax1.plot(df_fft.wavenumber_invcm, isotropic_average(df_fft, 'epsilon_real'), '--', label='FFT')
ax1.axhline(0, color='k', linewidth=0.5, linestyle='--')
ax1.set_ylabel(r'$\bar{\epsilon}_1(\omega)$')
ax1.legend()

diff = isotropic_average(df_fft, 'epsilon_real').to_numpy() - isotropic_average(df_vec, 'epsilon_real').to_numpy()
ax2.plot(df_vec.wavenumber_invcm, diff)
ax2.axhline(0, color='k', linewidth=0.5, linestyle='--')
ax2.set_xlabel(r'Wavenumber (cm$^{-1}$)')
ax2.set_ylabel('FFT − Vectorized')

fig.tight_layout()
fig.subplots_adjust(hspace=0)

../_images/get_started_dielectric_function_10_0.png

Kramers-Kronig performance#

The timing comparison below illustrates the performance advantage of the 'fft' method over the 'vectorized' approach, which comes at a small cost in accuracy (see above).

[6]:

from calorine.tools import apply_kramers_kronig
import time
from pandas import DataFrame

sizes = [100, 500, 1000, 5000, 10000]
times_vec = []
times_fft = []

for n in sizes:
    df_test = DataFrame({'angular_frequency': np.linspace(0.1, 100.0, n),
                         'epsilon_imag': np.sin(np.linspace(0.1, 100.0, n))})

    nrep = max(3, 200 // n)
    t0 = time.perf_counter()
    for _ in range(nrep):
        apply_kramers_kronig(df_test, method='vectorized')
    times_vec.append((time.perf_counter() - t0) / nrep * 1000)

    t0 = time.perf_counter()
    for _ in range(500):
        apply_kramers_kronig(df_test, method='fft')
    times_fft.append((time.perf_counter() - t0) / 500 * 1000)

fig, ax = plt.subplots(figsize=(6, 3))
ax.loglog(sizes, times_vec, 'o-', label=r"method='vectorized', $\mathcal{O}(n^2)$")
ax.loglog(sizes, times_fft, 's-', label=r"method='fft', $\mathcal{O}(n \log n)$")
ax.set_xlabel('Array length $n$')
ax.set_ylabel('Wall time (ms)')
ax.legend()
fig.tight_layout()

../_images/get_started_dielectric_function_12_0.png