
A machine learning based intramolecular potential for a flexible organic molecule.

Published version





Csányi, Gábor 


Quantum mechanical predictive modelling in chemistry and biology is often hindered by the long time scales and large system sizes required of the computational model. Here, we employ the kernel regression machine learning technique to construct an analytical potential, using the Gaussian Approximation Potential software and framework, that reproduces the quantum mechanical potential energy surface of a small, flexible, drug-like molecule, 3-(benzyloxy)pyridin-2-amine. Challenges linked to the high dimensionality of the configurational space of the molecule are overcome by developing an iterative training protocol and employing a representation that separates short- and long-range interactions. The analytical model is connected to the MCPRO simulation software, which allows us to perform Monte Carlo simulations of the small molecule bound to two proteins, p38 MAP kinase and leukotriene A4 hydrolase, as well as in water. We demonstrate that our machine-learning-based intramolecular model is transferable to the condensed phase, and show that the use of a faithful representation of the quantum mechanical potential energy surface can result in corrections to absolute protein-ligand binding free energies of up to 2 kcal mol⁻¹ in the example studied here.
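
As a rough illustration of the kernel regression idea named in the abstract, the sketch below fits a one-dimensional toy energy profile with a squared-exponential kernel. It is not the Gaussian Approximation Potential software or its API, and it does not use the representation described in the paper. The toy torsional energy function, the length scale, the regularisation strength, and all function names are assumptions chosen only for illustration.

```python
import numpy as np

def sq_exp_kernel(a, b, length_scale):
    """Squared-exponential kernel between two 1-D arrays of coordinates."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def fit(x_train, y_train, length_scale=0.5, reg=1e-6):
    """Solve (K + reg * I) w = y for the regression weights."""
    K = sq_exp_kernel(x_train, x_train, length_scale)
    return np.linalg.solve(K + reg * np.eye(len(x_train)), y_train)

def predict(x_new, x_train, weights, length_scale=0.5):
    """Predicted energy: kernel-weighted sum over the training points."""
    return sq_exp_kernel(x_new, x_train, length_scale) @ weights

def toy_energy(phi):
    """Hypothetical torsional profile standing in for reference QM energies."""
    return 1.0 * (1.0 + np.cos(3.0 * phi)) + 0.5 * (1.0 - np.cos(phi))

# Train on a coarse grid of torsion angles, then test on a finer grid
# that lies inside the training range.
x_train = np.linspace(-np.pi, np.pi, 25)
weights = fit(x_train, toy_energy(x_train))

x_test = np.linspace(-3.0, 3.0, 200)
max_err = np.max(np.abs(predict(x_test, x_train, weights) - toy_energy(x_test)))
print(f"max abs error on test grid: {max_err:.2e}")
```

In this toy setting the fitted model is an analytical function of the coordinate, so it can be evaluated cheaply at new configurations, which is the property a sampling scheme such as Monte Carlo relies on.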



Epoxide Hydrolases, Machine Learning, Monte Carlo Method, Organic Chemicals, Protein Binding, Quantum Theory, Software, Thermodynamics, p38 Mitogen-Activated Protein Kinases

Journal Title

Faraday Discuss


Royal Society of Chemistry (RSC)
Engineering and Physical Sciences Research Council (EP/J010847/1)
Engineering and Physical Sciences Research Council (EP/P022596/1)