Towards machine learning for the weather: developing methods using simplified dynamical systems
Abstract
Forecasting the behaviour of complex systems is one of the most important modelling exercises, and weather forecasting is a major enterprise within it. Recently, there has been a push to use machine learning (ML) in this field, given ML's ability to learn relationships which are too complex for humans to specify. This thesis presents work towards improving ML models for the weather, by developing methods and proofs of concept on simplified dynamical systems. The work explores three key design choices: approach, data, and model.
The first part of the thesis explores a hybrid approach that integrates physical principles with ML, in the context of atmospheric parameterization. Atmospheric parameterization involves modelling unresolved processes, such as cloud formation, within numerical weather prediction (NWP) frameworks. Classical parameterizations rely heavily on empirically derived equations, which can limit their accuracy and generalizability. This part applies ML to parameterization tasks, leveraging ML's ability to learn data-driven components whilst retaining existing physical knowledge.
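The hybrid idea can be sketched in a few lines: the resolved physics supplies most of the tendency, and a learned term stands in for unresolved processes. Both right-hand sides below are illustrative placeholders, not the models developed in the thesis.

```python
import numpy as np

def resolved_physics(x):
    """Known, resolved dynamics (placeholder: simple linear damping)."""
    return -0.5 * x

def learned_parameterization(x, w):
    """Data-driven term for unresolved processes (placeholder: linear in x)."""
    return w * x

def hybrid_step(x, w, dt=0.1):
    """Advance the state using the physics tendency plus the learned correction."""
    return x + dt * (resolved_physics(x) + learned_parameterization(x, w))

# Roll the hybrid model forward from a uniform initial state.
x = np.ones(4)
for _ in range(10):
    x = hybrid_step(x, w=-0.1)
```

The key design choice is that the ML component only needs to learn the residual processes, rather than the full dynamics.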
Focusing on ML model design, Chapter 3 uses the Lorenz 96 model to examine limitations in existing ML tools' ability to capture temporal correlations in stochastic systems. The chapter proposes a recurrent neural network (RNN) architecture to address these deficiencies, demonstrating improved performance in replicating the chaotic dynamics of the Lorenz 96 system. Chapter 4 then examines a discrepancy in model performance between offline evaluation and iterative simulations (rollouts). Using the cloud modelling within the European Centre for Medium-Range Weather Forecasts' Single Column Model as a case study, this work highlights how targeted model development can improve online performance by addressing deficiencies that are masked during offline evaluation.
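For concreteness, the Lorenz 96 system used as a testbed in Chapter 3 can be integrated in a few lines. The dimension, forcing F = 8, and step size below are common illustrative choices, not necessarily those used in the thesis.

```python
import numpy as np

def lorenz96_tendency(x, F=8.0):
    """Tendency dx_k/dt = (x_{k+1} - x_{k-2}) * x_{k-1} - x_k + F, with cyclic indices."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def step_rk4(x, dt=0.01, F=8.0):
    """One fourth-order Runge-Kutta step."""
    k1 = lorenz96_tendency(x, F)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, F)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, F)
    k4 = lorenz96_tendency(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Perturb the x = F fixed point slightly and integrate; chaos amplifies
# the perturbation until the state settles onto the attractor.
x = np.full(8, 8.0)
x[0] += 0.01
for _ in range(1000):
    x = step_rk4(x)
```

The same sensitivity to small perturbations is what makes capturing temporal correlations in such systems a demanding test for ML emulators.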
The second part of the thesis leaves the parameterization setting and explores broader challenges related to data and model design for ML for weather. Data availability is critical for training ML models, but practical limitations often mean that potentially useful datasets have mismatched temporal or spatial resolutions. Chapter 5 introduces a transfer learning framework to address these challenges by pretraining models on data at one resolution and fine-tuning them on another. This framework shows how ML models can learn from heterogeneous datasets, improving generalization and forecast skill in data-scarce regimes.
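A minimal sketch of the pretrain-then-fine-tune idea, using synthetic "coarse" and "fine" linear datasets and plain gradient descent; the data, model, and learning rates here are illustrative assumptions rather than the thesis's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_linear(X, y, w0=None, lr=0.1, steps=500):
    """Fit y ~ X @ w by gradient descent, optionally warm-starting from w0."""
    w = np.zeros(X.shape[1]) if w0 is None else w0.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Abundant "coarse-resolution" data and scarce "fine-resolution" data drawn
# from related (but not identical) linear relationships -- purely synthetic.
w_coarse, w_fine = np.array([1.0, -2.0]), np.array([1.2, -1.8])
X_big = rng.normal(size=(1000, 2))
y_big = X_big @ w_coarse + 0.1 * rng.normal(size=1000)
X_small = rng.normal(size=(10, 2))
y_small = X_small @ w_fine + 0.1 * rng.normal(size=10)

# Pretrain on the abundant dataset, then fine-tune briefly on the scarce one.
w_pre = fit_linear(X_big, y_big)
w_tuned = fit_linear(X_small, y_small, w0=w_pre, steps=50)
```

Because the pretrained weights already encode the shared structure, only a short fine-tuning run on the scarce dataset is needed to adapt to it.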
The final part of the thesis examines further challenges in ML model design. Chapter 6 introduces the Taylorformer, a novel architecture for continuous processes inspired by Taylor series expansions and Gaussian-process-based attention mechanisms. The model captures uncertainty and performs well on both interpolation and extrapolation tasks with unevenly sampled data. Chapter 7 then investigates the pervasive issue of error accumulation in iterative ML forecasting, where small errors in individual predictions compound over time, degrading long-term accuracy. The chapter presents SynergiNet, a model designed to mitigate such errors through a combination of architectural enhancements and novel training strategies. As a proof of concept, SynergiNet produces robust and realistic long-term predictions on the Lorenz 63 system.
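Error accumulation is easy to reproduce on the Lorenz 63 system: a stepper with even a tiny per-step bias, standing in here for an imperfect one-step ML emulator, diverges from the reference trajectory over a long rollout. The bias magnitude and integration settings below are illustrative assumptions.

```python
import numpy as np

def lorenz63_step(s, dt=0.01, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One forward-Euler step of the Lorenz 63 equations."""
    x, y, z = s
    ds = np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])
    return s + dt * ds

def surrogate_step(s):
    """The true stepper plus a tiny systematic bias, mimicking an imperfect emulator."""
    return lorenz63_step(s) + 1e-6

# Roll both models out from the same initial condition and track divergence.
s0 = np.array([1.0, 1.0, 1.0])
truth, surrogate = s0.copy(), s0.copy()
errors = []
for _ in range(2000):
    truth = lorenz63_step(truth)
    surrogate = surrogate_step(surrogate)
    errors.append(np.linalg.norm(truth - surrogate))
```

The per-step bias of 1e-6 is amplified exponentially by the chaotic dynamics, so the rollout error eventually reaches the scale of the attractor itself; mitigating this compounding is the problem SynergiNet targets.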
Advisors
Christensen, Hannah
Hosking, Scott

