sGDML Documentation

This is a highly optimized implementation of the recently proposed symmetric gradient domain machine learning (sGDML) force field model [1] [2]. It is able to faithfully reproduce detailed global potential energy surfaces (PES) for small- and medium-sized molecules from a limited number of user-provided reference calculations.

We provide a set of Python routines to reconstruct and evaluate custom sGDML force fields [3]. A user-friendly command-line interface guides you through the complete process of model creation, in an effort to make this novel machine learning approach accessible to a broad range of practitioners.

It’s easy to get going!

Here is how to reconstruct an ethanol force field using 200 training examples from the published benchmark dataset:

$ sgdml-get dataset ethanol_dft
$ sgdml all ethanol_dft.npz 200 1000 5000

We use another 1000 points as the validation dataset and finally estimate the generalization error of the trained model on an additional 5000 geometries. All of these subsets are automatically sampled from the provided bulk dataset ethanol_dft.npz.
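
To illustrate what the three numbers on the command line mean, here is a minimal sketch of drawing non-overlapping training, validation, and test subsets from one bulk dataset. It is not the sampling code used by sgdml: the function name split_indices, the uniform random sampling, the seed, and the dataset size of 100000 are all assumptions made for this example, and the actual tool may select points differently.

```python
import random


def split_indices(n_total, n_train, n_valid, n_test, seed=0):
    """Draw three disjoint index sets from range(n_total).

    Hypothetical helper for illustration only; it samples uniformly at
    random, which is an assumption and not necessarily what sgdml does.
    """
    needed = n_train + n_valid + n_test
    if needed > n_total:
        raise ValueError(
            f"requested {needed} points, but the dataset only has {n_total}"
        )

    rng = random.Random(seed)
    # Sampling without replacement guarantees that no geometry is reused
    # across the three subsets.
    picked = rng.sample(range(n_total), needed)

    train = sorted(picked[:n_train])
    valid = sorted(picked[n_train:n_train + n_valid])
    test = sorted(picked[n_train + n_valid:])
    return train, valid, test


if __name__ == "__main__":
    # 100000 is a made-up dataset size; 200 / 1000 / 5000 mirror the
    # arguments passed to the command above.
    train, valid, test = split_indices(100_000, 200, 1000, 5000)
    print(len(train), len(valid), len(test))
```

Keeping the subsets disjoint matters because the error measured on the 5000 test geometries is only a fair estimate of the generalization error if none of those geometries were seen during training or validation.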

The output of the sgdml program will look something like this:


Code Development

The sGDML code is developed through our GitHub repository:


Please cite GDML and sGDML as follows:

[1] Chmiela, S., Sauceda, H. E., Müller, K.-R., Tkatchenko, A. (2018). Towards Exact Molecular Dynamics Simulations with Machine-Learned Force Fields. Nat. Commun., 9(1), 3887.
[2] Chmiela, S., Tkatchenko, A., Sauceda, H. E., Poltavsky, I., Schütt, K. T., Müller, K.-R. (2017). Machine Learning of Accurate Energy-conserving Molecular Force Fields. Sci. Adv., 3(5), e1603015.
[3] Chmiela, S., Sauceda, H. E., Poltavsky, I., Müller, K.-R., Tkatchenko, A. (2018). sGDML: Constructing Accurate and Data Efficient Molecular Force Fields Using Machine Learning. arXiv:1812.04986.


This code is freely available under the terms of the MIT license.