
Machine-assisted synthesis and development in pharmaceutical industry





Jorayev, Perman 



Perman Jorayev

Process development of novel chemical transformations is often a laborious and complex task. This is mostly due to the difficulties in identifying the underlying reaction mechanism(s), selecting the chemical (intrinsic) and physical (process) parameters that affect the process objective(s), and quantifying the nonlinear interactions between them, as well as to a lack of data. Reducing the cost and time of developing robust processes from novel chemical transformations therefore requires more efficient solutions to these individual challenges. In this project, we present several new workflows to tackle some of these challenges.

First, we explored the use of the black-box Bayesian optimisation algorithm TS-EMO for the optimisation of complex reaction networks, such as the conversion of the bio-waste crude sulphate turpentine to functional molecules, with no prior mechanistic information. Using Gaussian processes as surrogate models and sampling from the reaction space based on the highest expected hypervolume improvement, algorithm-guided optimisation of eight continuous variables identified the experimental Pareto front, which accounts for the trade-offs between the two reaction objectives, conversion and yield. Then, expanding to the discrete variable space, we developed a solvent recommendation workflow based on similarity and data fusion techniques and on sustainability guides. Using two solvent libraries, we demonstrated the molecular informatics-driven workflow on various chemical transformations and identified a significant overlap with solvent selection tools developed by AstraZeneca and Syngenta.
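The notion of a Pareto front for two competing objectives can be illustrated with a minimal sketch. It is not the TS-EMO algorithm and uses no data from the thesis: the function names and the (conversion, yield) values are hypothetical, and both objectives are assumed to be maximised. An experiment belongs to the front if no other experiment is at least as good in both objectives and strictly better in one.

```python
from typing import List, Tuple

# One experiment's outcome: (conversion, yield), both to be maximised.
Point = Tuple[float, float]


def dominates(a: Point, b: Point) -> bool:
    """Return True if a is no worse than b in both objectives
    and strictly better in at least one."""
    return a[0] >= b[0] and a[1] >= b[1] and (a[0] > b[0] or a[1] > b[1])


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical outcomes for illustration only (not measured values).
experiments = [
    (0.90, 0.40),
    (0.70, 0.60),
    (0.60, 0.55),  # dominated by (0.70, 0.60)
    (0.50, 0.80),
    (0.85, 0.35),  # dominated by (0.90, 0.40)
]

front = pareto_front(experiments)
print(front)
```

In this toy data set, three of the five points remain on the front; each represents a different compromise between the two objectives, which is the trade-off a multi-objective optimiser exposes to the process chemist.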

Next, considering all continuous and discrete variables in the reaction, holistic modelling of Buchwald-Hartwig amine synthesis using DFT-based parameterisation techniques and a dozen machine learning algorithms led to the development of highly predictive reaction models. Using several Explainable AI tools, we identified reaction-specific descriptors and validated their transferability in the laboratory on a similar reaction. Finally, to develop a robust process for a sensitive photoredox amine synthesis reaction, we generated a priori knowledge in the form of solubility predictions and measurements and the quantification of absorbance and photon flux. Using the recently developed NEMO algorithm, we demonstrated simultaneous optimisation of continuous and discrete variables for the reaction objectives yield and cost. The multiple case studies presented in this project demonstrated the efficiency and usefulness of these workflows in process development.





Lapkin, Alexei


Bayesian optimisation, Continuous flow automation, Machine Learning, Process development


Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge
UCB Pharma CARES Cambridge