
A script to make molecular dynamics (MD) datasets for neural networks from given LAMMPS trajectories automatically.


MDDatasetBuilder is a script to construct reference datasets for the training of neural network potentials from given LAMMPS trajectories.

Complex reaction processes in combustion unraveled by neural network-based molecular dynamics simulation, Nature Communications, 11, 5713 (2020), DOI: 10.1038/s41467-020-19497-z

Author: Jinzhe Zeng



MDDatasetBuilder can be installed with pip:

pip install mddatasetbuilder

Installation should take only a few minutes on a typical desktop computer.


Simple example

Prepare a LAMMPS dump file. A LAMMPS bond file can also be supplied to provide additional information.

datasetbuilder -d dump.ch4 -b bonds.reaxc.ch4_new -a C H O -n ch4 -i 25

Here, dump.ch4 is the name of the dump file. bonds.reaxc.ch4_new is the name of the bond file, which is optional. C H O are the elements in the trajectory. ch4 is the name of the dataset. 25 is the time step interval; the default value is 1.
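When the same command has to be issued for several trajectories, it can help to assemble the argument list programmatically. The following is a minimal sketch that uses only the flags shown above (-d, -b, -a, -n, -i); the helper name build_datasetbuilder_command is hypothetical and is not part of MDDatasetBuilder.

```python
import shlex


def build_datasetbuilder_command(dump, elements, name, bond=None, interval=1):
    """Assemble a datasetbuilder argument list (hypothetical helper).

    Only the flags documented in the example above are used.
    """
    cmd = ["datasetbuilder", "-d", dump]
    if bond is not None:
        # The bond file is optional, so -b is added only when one is given.
        cmd += ["-b", bond]
    cmd += ["-a", *elements, "-n", name, "-i", str(interval)]
    return cmd


cmd = build_datasetbuilder_command(
    "dump.ch4", ["C", "H", "O"], "ch4",
    bond="bonds.reaxc.ch4_new", interval=25,
)
# Print the command line in shell form for inspection.
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once datasetbuilder is installed and on the PATH.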

Then generate Gaussian input files for each structure in the dataset and calculate the potential energies and atomic forces (this assumes that Gaussian 16 is already installed):

qmcalc -d dataset_ch4_GJf/000
qmcalc -d dataset_ch4_GJf/001
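If the dataset contains many numbered subdirectories, the qmcalc calls can be generated in a loop rather than typed one by one. This is a sketch under the assumption that every subdirectory of dataset_ch4_GJf should be processed the same way as 000 and 001 above; the helper name qmcalc_commands is hypothetical.

```python
from pathlib import Path


def qmcalc_commands(dataset_dir):
    """Return one 'qmcalc -d <subdir>' argument list per subdirectory.

    Hypothetical helper: assumes each subdirectory of dataset_dir holds
    Gaussian input files, as 000 and 001 do in the example above.
    """
    root = Path(dataset_dir)
    subdirs = sorted(p for p in root.iterdir() if p.is_dir())
    return [["qmcalc", "-d", str(p)] for p in subdirs]


# Dry run: print the commands only if the dataset directory exists.
if Path("dataset_ch4_GJf").is_dir():
    for cmd in qmcalc_commands("dataset_ch4_GJf"):
        print(" ".join(cmd))
```

Each argument list can then be executed with subprocess.run(cmd, check=True). Printing the commands first makes it easy to confirm which directories will be processed before any Gaussian job starts.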

Next, prepare a DeePMD dataset and use DeePMD-kit to train a NN model.

preparedeepmd -p dataset_ch4_GJf
cd train && dp train train.json

The runtime of the software depends on the amount of data. It is better suited to running on a server than on a desktop computer.


The MDDatasetBuilder package has been integrated with the DP-GEN software:

dpgen init_reaction reaction.json machine.json

where an example of reaction.json can be found here, and machine.json should include the following keys: reaxff_command, reaxff_resources, reaxff_machine, build_command, build_resources, build_machine, fp_command, fp_resources, fp_machine, and fp_group_size. reaxff_command is the LAMMPS command, build_command is the MDDatasetBuilder command, and fp_command is the Gaussian 16 command.
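Before launching dpgen, it can be useful to confirm that machine.json contains every key listed above. The following is a minimal sketch that checks only for the presence of the keys; it makes no assumption about their values, whose format is defined by DP-GEN. The helper name missing_machine_keys is hypothetical.

```python
import json

# Keys that machine.json should include, as listed above.
REQUIRED_KEYS = [
    "reaxff_command", "reaxff_resources", "reaxff_machine",
    "build_command", "build_resources", "build_machine",
    "fp_command", "fp_resources", "fp_machine", "fp_group_size",
]


def missing_machine_keys(machine):
    """Return the required keys absent from a parsed machine.json dict."""
    return [key for key in REQUIRED_KEYS if key not in machine]


def check_machine_file(path):
    """Load a machine.json file and report which required keys are missing."""
    with open(path) as f:
        return missing_machine_keys(json.load(f))
```

Calling check_machine_file("machine.json") returns an empty list when all ten keys are present.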

Source distribution: mddatasetbuilder-1.3.2.tar.gz (22.6 kB)
