Using Machine Learning to Optimise Asymmetric Hydrogenation Reactions of Tetra-substituted Olefins
Abstract
1,2-Contiguous stereocenters are an especially valuable motif in drug discovery. However, direct access to them through the asymmetric hydrogenation (AH) of tetra-substituted olefins (TSC=C) remains a persistent challenge in homogeneous catalysis. The high steric hindrance of these substrates, combined with the strong catalyst-substrate specificity of current systems, makes it difficult to achieve high enantioselectivities for novel substrates. As a result, optimisation campaigns generally demand extensive catalyst screening. This thesis explores an alternative strategy to accelerate the optimisation of these transformations using machine learning (ML). Chapter 1 outlines the key advancements in the AH of TSC=C, introduces the theoretical basis of applying ML to reaction outcome prediction, and reviews recent studies that employ ML in reaction optimisation. Chapter 2 describes the construction of a curated literature dataset of AH reactions of TSC=C and evaluates a range of ML models for predicting enantiomeric excess (ee). Both regression and classification models were constructed, with tree-based ensembles delivering the strongest performance, achieving absolute errors below 10% ee and classification accuracies above 80%. Chapter 3 applies these models to novel experimental AH reactions of TSC=C, evaluating their ability to perform in new areas of chemical space. While predictive accuracy declined on low-ee reactions, the results highlight both the potential and the current limitations of the ML models in guiding optimisation beyond the boundaries of existing datasets. Chapter 4 investigates Bayesian optimisation (BO) as an alternative workflow for experimental optimisation of this transformation. Using Gaussian process regression and multi-objective acquisition functions, the iterative BO campaign efficiently explored reaction space and identified conditions that afforded enantioselectivities of up to 93% ee, demonstrating BO as a powerful and generalisable strategy for accelerating discovery in this challenging transformation.
