Advancing Normalising Flows to Model Boltzmann Distributions
Repository URI
Repository DOI
Change log
Authors
Abstract
Molecules can assume a variety of configurations. These configurations occur stochastically according to the so-called Boltzmann distribution. Modelling and sampling from this distribution is important for many applications, such as understanding biochemical processes and diseases, as well as discovering drugs. Traditional methods for sampling from Boltzmann distributions are expensive. Instead, the distribution can be approximated with machine learning models, which serve as a surrogate. Normalising flows, which are tractable density models, are popular for this task. This thesis focuses on improving methods to implement, sample from, and train normalising flows, as well as on enhancing their architecture, with an emphasis on the application to Boltzmann distributions. First, a package for implementing normalising flows in Python is introduced. It encompasses most common flow architectures and provides many tools that are especially useful for modelling Boltzmann distributions, such as sampling layers and flows for modelling periodic coordinates, e.g. angles. Next, we present a hyperparameter tuning procedure for the sampling method Hamiltonian Monte Carlo, which runs gradient-based optimisation on a variational objective. Hamiltonian Monte Carlo can be applied on top of a normalising flow to enhance its samples. With the hyperparameters improved by our method, we can approximate the Boltzmann distribution of the molecule alanine dipeptide more closely. Thirdly, normalising flows are made more expressive by the introduction of a novel base distribution, whose topology is altered via rejection sampling with a learnt acceptance function. Thereby, we overcome an architectural weakness of normalising flows and achieve better performance on a diverse set of applications. Moreover, to address the reliance of most training methods for flows on data, which is expensive to generate, we introduce a technique that only requires the unnormalised density of the target. During training, we sample from the flow and move the samples closer to the target via annealed importance sampling. In this way, we can estimate our mass-covering objective, which also leads to improved sampling efficiency. With our approach, we are the first to learn the full Boltzmann distribution of alanine dipeptide without any training data, using only the target density. Finally, we incorporate the rotational and translational symmetry that most physical systems possess into a flow model. We are among the first to learn the Boltzmann distribution of alanine dipeptide in Cartesian coordinates, while our model is significantly faster to evaluate than competing approaches.
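To make the idea of a base distribution with altered topology more concrete, the following is a minimal, illustrative PyTorch sketch of rejection sampling from a Gaussian proposal with a learnt acceptance function a(x) ∈ [0, 1], giving the resampled density q(x) = a(x) N(x; 0, I) / Z with Z estimated by Monte Carlo. It is not the thesis's implementation; the class and parameter names (ResampledGaussian, acceptance_net, n_mc) are placeholders chosen for this example.

```python
import torch
import torch.nn as nn


class ResampledGaussian(nn.Module):
    """Illustrative base distribution reshaped by a learnt acceptance function."""

    def __init__(self, dim, hidden=64, n_mc=1024):
        super().__init__()
        self.dim = dim
        self.n_mc = n_mc  # Monte Carlo samples used to estimate the normaliser Z
        # Learnt acceptance function a(x) in [0, 1]
        self.acceptance_net = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )
        self.proposal = torch.distributions.MultivariateNormal(
            torch.zeros(dim), torch.eye(dim)
        )

    def log_prob(self, x):
        # log q(x) = log a(x) + log N(x; 0, I) - log Z,  Z = E_N[a(x)]
        log_a = torch.log(self.acceptance_net(x).squeeze(-1) + 1e-12)
        z_samples = self.proposal.sample((self.n_mc,))
        log_z = torch.log(self.acceptance_net(z_samples).mean() + 1e-12)
        return log_a + self.proposal.log_prob(x) - log_z

    @torch.no_grad()
    def sample(self, num_samples, max_tries=100):
        # Rejection sampling: draw from the Gaussian proposal and accept each
        # point with probability a(x); retry a bounded number of times.
        out = []
        for _ in range(max_tries):
            x = self.proposal.sample((num_samples,))
            accept = torch.rand(num_samples) < self.acceptance_net(x).squeeze(-1)
            out.append(x[accept])
            if sum(o.shape[0] for o in out) >= num_samples:
                break
        return torch.cat(out)[:num_samples]


# Usage: such a base can replace a plain Gaussian underneath a normalising flow.
base = ResampledGaussian(dim=2)
x = base.sample(8)
print(base.log_prob(x).shape)  # torch.Size([8])
```

Because the acceptance function is learnt jointly with the flow, the effective support of the base can split into several modes, which is one way to address the topological mismatch between a unimodal base and a multimodal target.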
Description
Date
Advisors
Schölkopf, Bernhard