## Invariant polynomials and machine learning

##### Abstract

In this thesis, we demonstrate the benefit of incorporating our knowledge of the symmetries of certain systems into the machine learning methods used to model them. Borrowing the necessary tools from commutative algebra and invariant theory, we construct systematic methods to obtain sets of invariant variables that describe systems from two domains: particle physics and physical chemistry. In both cases, the systems are described by a collection of vectors. In the former, these vectors represent the particle momenta in collision events, and the system is Lorentz- and permutation-invariant. In the latter, the vectors represent the positions of atoms in molecules (or lattices), and the systems are Euclidean- and permutation-invariant.
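The Lorentz invariance mentioned above can be illustrated numerically: the pairwise Minkowski inner products of a set of four-momenta are unchanged under a boost. The following sketch is purely illustrative and not code from the thesis; the name `minkowski_products` is a hypothetical choice of ours.

```python
import numpy as np

# Minkowski metric with signature (+, -, -, -).
g = np.diag([1.0, -1.0, -1.0, -1.0])

def minkowski_products(momenta):
    """Pairwise Minkowski inner products p_i . p_j for four-momenta (rows)."""
    return momenta @ g @ momenta.T

# A boost along the x-axis with rapidity eta is a Lorentz transformation.
eta = 0.7
c, s = np.cosh(eta), np.sinh(eta)
boost = np.array([[c, s, 0, 0],
                  [s, c, 0, 0],
                  [0, 0, 1, 0],
                  [0, 0, 0, 1]])

p = np.random.default_rng(0).normal(size=(3, 4))   # three four-momenta
before = minkowski_products(p)
after = minkowski_products(p @ boost.T)            # boost every momentum
assert np.allclose(before, after)                  # invariants unchanged
```

Permutation invariance acts on such a matrix of products by simultaneously reordering its rows and columns, which is why symmetric functions of these entries become the natural network inputs.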

We start by focusing on the algebras of invariant polynomials in these vectors and design systematic methods to obtain sets of generating variables. To do so, we build on two theorems of Weyl, which tell us that the algebra of orthogonal-group-invariant polynomials in a collection of vectors is generated by the pairwise inner products of those vectors.
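Weyl's first fundamental theorem for the orthogonal group — that every O(d)-invariant polynomial in a set of vectors is a polynomial in their pairwise inner products — can be checked numerically on examples. This sketch is illustrative; the function name `generating_invariants` is ours, not the thesis's.

```python
import numpy as np

def generating_invariants(vectors):
    """Gram matrix of pairwise inner products: a generating set for the
    algebra of O(d)-invariant polynomials in the row vectors."""
    return vectors @ vectors.T

rng = np.random.default_rng(1)
v = rng.normal(size=(4, 3))                # four vectors in R^3

# A random orthogonal matrix from the QR decomposition of a Gaussian matrix.
q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
assert np.allclose(q @ q.T, np.eye(3))     # q is orthogonal

# Rotating/reflecting every vector leaves the generators unchanged.
assert np.allclose(generating_invariants(v), generating_invariants(v @ q.T))
```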

Having characterised these invariant algebras, we then discuss why their generators are indeed sufficient for our purposes. More precisely, we discuss and prove approximation theorems which allow us to use the generators of invariant algebras to approximate continuous invariant functions in machine learning algorithms in general, and in neural networks specifically. Finally, we implement these variables in neural networks applied to regression tasks to test for efficiency improvements. We perform our tests for a variety of hyperparameter choices and find an overall reduction of the loss on training data and a significant reduction of the loss on validation data. To quantify the performance of these neural networks in a different way, we treat the problem from a Bayesian inference perspective and employ nested sampling techniques to perform model comparison. Beyond a certain network size, we find that networks utilising Hironaka decompositions perform best.
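The pipeline summarised above — computing a generating set of invariants from the raw vectors and feeding it to a network for regression — might look as follows in outline. The tiny two-layer network, the toy invariant target, and all names here are illustrative placeholders, not the architectures or tasks actually tested in the thesis.

```python
import numpy as np

rng = np.random.default_rng(2)

def invariant_features(vectors):
    """Upper triangle of the Gram matrix, flattened into a feature vector."""
    gram = vectors @ vectors.T
    return gram[np.triu_indices(gram.shape[0])]

# Toy dataset: 256 "events", each a set of three vectors in R^2, with a
# rotation- and permutation-invariant regression target (sum of squared norms).
X = rng.normal(size=(256, 3, 2))
features = np.stack([invariant_features(x) for x in X])
y = np.linalg.norm(X, axis=(1, 2)) ** 2

# A one-hidden-layer network trained by full-batch gradient descent.
W1 = rng.normal(size=(features.shape[1], 16)) * 0.1
b1 = np.zeros(16)
W2 = rng.normal(size=(16, 1)) * 0.1
b2 = np.zeros(1)

def mse():
    h = np.tanh(features @ W1 + b1)
    pred = (h @ W2 + b2).ravel()
    return float(np.mean((pred - y) ** 2))

initial_loss = mse()
for _ in range(500):
    h = np.tanh(features @ W1 + b1)
    pred = (h @ W2 + b2).ravel()
    err = pred - y
    # Backpropagation written out by hand for this two-layer sketch.
    gW2 = h.T @ err[:, None] / len(y)
    gb2 = np.array([err.mean()])
    dh = (err[:, None] @ W2.T) * (1.0 - h ** 2)
    gW1 = features.T @ dh / len(y)
    gb1 = dh.mean(axis=0)
    for param, grad in ((W2, gW2), (b2, gb2), (W1, gW1), (b1, gb1)):
        param -= 0.05 * grad
final_loss = mse()
```

On this toy problem the training loss decreases from its initial value; the experiments described above involve real regression tasks and hyperparameter scans rather than this minimal setup.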