Towards Machine Learning Foundation Models for Materials Chemistry


Abstract

This thesis demonstrates how recent advances in machine learning (ML) for materials can accelerate our search for new stable inorganic crystals. We show how best to measure and compare the utility of different models, what range of applications a foundational ML force field can be expected to cover, what shortcomings current ML potentials still exhibit, and how these can be overcome. Specifically, we present Matbench Discovery, an evaluation framework that closely mimics a real-world materials discovery campaign and establishes universal interatomic potentials (UIPs) as the state-of-the-art methodology for accelerating the discovery of thermodynamically stable inorganic crystals. Next, we design and execute an ML-guided dielectric discovery workflow that integrates rapid ML screening, targeted crystal generation, and high-throughput ab initio validation, all feeding into informed experimental characterization. The campaign culminates in the synthesis of two novel dielectric materials, CsTaTeO6 and Bi2Zr2O7, the former generated by our workflow. Finally, we comprehensively analyze MACE-MP, the best-performing model we trained for Matbench Discovery, which has since proven to be a highly versatile foundation model for atomistic simulations. Although pre-trained purely on inorganic bulk crystals, it extrapolates remarkably well to diverse chemistries and material classes far beyond its training distribution, as evidenced by its qualitative and often even quantitative agreement with density functional theory (DFT) in 36 diverse test cases, including phonon spectra, ammonia/borane, amorphous carbon, aqueous interfaces, batteries, carburane, cathode materials, combustion, dichalcogenides, dislocation, dissolution, heterogeneous catalysis, hydrogen, ice & water, MOFs, molten salts, multi-component alloys, nanoparticles, perovskites, polymerization, Pt surface, Si interstitial, solvent mixtures, and zeolites.
Taken together, these projects showcase how graph neural network (GNN) force fields can form a central pillar of computational materials science. They occupy a different point on the cost-accuracy Pareto front than DFT: not much worse in accuracy, yet orders of magnitude cheaper thanks to linear rather than cubic scaling with system size. Machine learning force fields have thus unlocked the study of complex phenomena over length and time scales previously inaccessible to numerical simulation.
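As a rough illustration of the scaling argument above, the following sketch compares how cost grows with system size under idealized linear and cubic scaling. It is not taken from the thesis: the function name `relative_cost` and the reference size of 100 atoms are assumptions chosen for illustration, and all prefactors (which differ greatly between DFT and ML force fields in practice) are ignored.

```python
def relative_cost(n_atoms: int, n_ref: int, exponent: float) -> float:
    """Cost of simulating n_atoms relative to n_ref atoms, assuming cost ~ N**exponent.

    Hypothetical helper for illustration only; prefactors are ignored.
    """
    return (n_atoms / n_ref) ** exponent


if __name__ == "__main__":
    n_ref = 100  # assumed reference system size
    for n_atoms in (100, 1_000, 10_000):
        linear = relative_cost(n_atoms, n_ref, 1.0)  # idealized force-field scaling
        cubic = relative_cost(n_atoms, n_ref, 3.0)   # idealized cubic scaling
        print(f"{n_atoms:>6} atoms: linear x{linear:,.0f}, cubic x{cubic:,.0f}")
```

Under these assumptions, a tenfold increase in system size raises the linear-scaling cost tenfold and the cubic-scaling cost a thousandfold, which is why the gap between the two approaches widens so quickly for large systems.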

Date

2024-06-04

Advisors

Keyser, Ulrich
Lee, Alpha

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Rights and licensing

Except where otherwise noted, this item's license is described as All Rights Reserved

Sponsorship

German Academic Scholarship Foundation (German: Studienstiftung)

Collections