High-throughput computational workflow for ligand discovery in catalysis with the CSD
Abstract
A novel semi-automated, high-throughput computational workflow for ligand/catalyst discovery based on the Cambridge Structural Database (CSD) is reported. Two potential transition states of the Ullmann–Goldberg reaction were identified and used as templates for a ligand search within the CSD, leading to >32 000 potential ligands. The ΔG‡ values for catalysts using these ligands were calculated using B97-3c//GFN2-xTB, with high success rates and good correlation with DLPNO-CCSD(T)/def2-TZVPP. Furthermore, machine learning models were developed from the generated data, leading to accurate predictions of ΔG‡, with 70.6–81.5% of predictions falling within ±4 kcal mol⁻¹ of the calculated ΔG‡, without the need for costly transition-state calculations. The accuracy of the machine learning models was improved to 75.4–87.8% using descriptors derived from TPSS/def2-TZVP//GFN2-xTB calculations, with a minimal increase in computational time. This new workflow offers significant advantages over currently used methods owing to its higher speed and lower computational cost, coupled with excellent accuracy compared to higher-level methods.
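The accuracy figures quoted above are the share of predictions whose error falls within a fixed tolerance of the calculated ΔG‡. As a minimal sketch of how such a metric can be computed (the function name `fraction_within` and the numbers below are illustrative assumptions, not code or data from the paper):

```python
from typing import Sequence


def fraction_within(
    predicted: Sequence[float],
    calculated: Sequence[float],
    tol: float = 4.0,
) -> float:
    """Return the fraction of predictions whose absolute error is <= tol.

    All values are assumed to be in kcal/mol; the default tolerance of 4.0
    mirrors the +/- 4 kcal/mol window quoted in the abstract.
    """
    if len(predicted) != len(calculated):
        raise ValueError("predicted and calculated must have the same length")
    if len(predicted) == 0:
        raise ValueError("at least one value is required")
    hits = sum(1 for p, c in zip(predicted, calculated) if abs(p - c) <= tol)
    return hits / len(predicted)


# Made-up barrier heights for illustration only (not results from the paper).
calculated = [20.0, 25.0, 30.0, 35.0]
predicted = [22.0, 30.5, 29.0, 39.0]

share = fraction_within(predicted, calculated)
print(f"{share:.1%} of predictions within 4 kcal/mol")
```

A single threshold-based share like this is easy to interpret for screening, but it hides the size of the misses, so it is usually read alongside an error measure such as the mean absolute error.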
Description
Acknowledgements: This research was carried out at the EPSRC Centre for Doctoral Training in Complex Particulate Products and Processes (EP/S022473/1) as part of a collaborative project with the Cambridge Crystallographic Data Centre (CCDC), whom we gratefully acknowledge.
Journal ISSN
2044-4761
Sponsorship
Cambridge Crystallographic Data Centre (Unassigned)

