Optimisation of CHO Cell Line Development Using Hybrid Modelling for Antibody Therapeutics
Abstract
Scaling up pharmaceutical production processes is an intricate and resource-intensive task that often starts inefficiently at the smallest scale. This approach is costly, time-consuming, and unsustainable, particularly for personalised medicines and biotherapeutics, which have the potential to be the safest and most effective therapeutics of all [1, 2]. Herein I propose a novel method for optimising upstream biopharmaceutical process scale-up using a data-driven hybrid model that integrates machine learning (ML) with first-principles mechanistic models. The hybrid tool forecasts bioreactor metabolite profiles from simple micro- and small-scale cultivation data, improving process knowledge, production efficiency, and sustainability while reducing environmental harm. This thesis connects laboratory-based processes with the long-term strategic goals of the pharmaceutical industry, reducing scale-up risks and helping to ensure that production goals are achieved at the commercialisation stage, with the aim of making potent biotherapeutics available to all. Monoclonal antibodies (mAbs) are the most rapidly growing class of biotherapeutics, owing to their success in treating severe diseases such as cancer and autoimmune disorders. Chinese Hamster Ovary (CHO) cells are the primary host for mAb production because of their flexible karyotype and human-like post-translational modifications. However, this flexibility also brings genetic instability and expression variability over longer cultivation. To address this, an upstream cell line development (CLD) process screens transfected CHO cell lines for stable growth, productivity, and mAb quality. CLD is an essential stage on the critical path of bringing a novel and safe biotherapeutic to market. It is often resource-intensive and highly empirical, typically lasting 5-12 months, and has been identified as the primary target for biopharmaceutical process optimisation to reduce manufacturing costs.
This thesis introduces a hybrid modelling tool specifically for CLD optimisation, combining ML and kinetic modelling to make automated and earlier cell line selection decisions, thereby reducing CLD resources and development timelines. Currently, final cell line selection relies solely on clone screening data generated in the late-stage CLD mini-bioreactor (MBR) fed-batch production runs, underutilising the limited earlier CLD scale-up and clone screening data from single-cell cloning (SSC) systems, well plates, and T25 cell culture flasks. The hybrid model uses these micro- and small-scale early-stage CLD data, together with ML, to forecast for each cell line the growth, titre, and metabolite profiles later observed in the MBR fed-batch production runs. This provides a significant gain in knowledge about the expected metabolism, productivity, and cell growth of cell lines in production runs, offering decision support for earlier CLD cell line selection. More broadly, the hybrid model addresses the common bioprocess challenge of scale-down modelling, since it has been extremely difficult to extrapolate cell line behaviour from small-scale well plates and flasks to the bioreactor. The hybrid model presented herein overcomes this challenge by using the small-scale data to forecast cell line behaviour in fed-batch production runs. The thesis describes the gradual development of the hybrid model, commencing with a multivariate data analysis of four historical CLD campaigns to identify key cell line selection criteria and correlations amongst the multi-scale clone screening CLD stages. Subsequently, the development of the mechanistic part of the hybrid, the multi-cell line kinetic model (MCKM), is described. The MCKM is a unique kinetic model that is able to describe 140 different CHO cell lines, recombinant for three unique mAbs.
The MCKM requires only one cell line cultivation of 49 data points per regression and condenses each cell line cultivation into a set of 13 kinetic parameters, encapsulating the unique metabolic characteristics of each cell line. Thereafter, the developed hybrid model is presented, which uses ML to predict the kinetic parameters for each cell line in the MBR from early-CLD SSC, well-plate, and T25 flask clone screening data. The predicted kinetic parameters are then fed into the MCKM to simulate and forecast the metabolite profiles for each cell line. The hybrid was particularly accurate for cell growth, mAb titre, and ammonium profiles. Finally, the effectiveness of the hybrid for autonomous selection of the lead clone in CLD was demonstrated: the hybrid correctly identified 90.5% of the test cell lines that will perform poorly in fed-batch production runs and should therefore be disregarded in early CLD.
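The two-stage structure described above (an ML step that predicts kinetic parameters from early-stage screening data, followed by a mechanistic simulation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the thesis implementation: the training values are invented, a nearest-neighbour average stands in for the ML model, and a toy Monod model with two predicted parameters (`mu_max`, `q_p`) stands in for the 13-parameter MCKM.

```python
from math import dist

# Hypothetical training set: scaled early-stage features per cell line
# (e.g. well-plate titre, T25 growth) paired with kinetic parameters
# regressed from earlier MBR runs. All numbers are invented.
TRAIN = [
    ((0.2, 0.3), {"mu_max": 0.020, "q_p": 0.5}),
    ((0.8, 0.7), {"mu_max": 0.035, "q_p": 1.5}),
    ((0.5, 0.5), {"mu_max": 0.028, "q_p": 1.0}),
]


def predict_parameters(features, k=2):
    """Stand-in ML step: average the parameters of the k nearest cell lines."""
    nearest = sorted(TRAIN, key=lambda row: dist(row[0], features))[:k]
    names = nearest[0][1].keys()
    return {n: sum(p[n] for _, p in nearest) / k for n in names}


def simulate(params, x0=0.3, s0=30.0, hours=240, dt=0.1, k_s=1.0, y_xs=0.5):
    """Mechanistic step: explicit-Euler integration of a toy Monod model.

    x = viable cell density, s = substrate, p = product titre (arbitrary units).
    """
    x, s, p = x0, s0, 0.0
    profile = [(0.0, x, s, p)]
    steps = int(round(hours / dt))
    for i in range(1, steps + 1):
        mu = params["mu_max"] * s / (k_s + s)  # Monod growth rate
        dx = mu * x                            # biomass growth
        ds = -dx / y_xs                        # substrate consumption
        dp = params["q_p"] * x                 # product formation
        x += dx * dt
        s = max(s + ds * dt, 0.0)              # substrate cannot go negative
        p += dp * dt
        profile.append((i * dt, x, s, p))
    return profile


def forecast(features):
    """Hybrid forecast: predicted parameters fed into the kinetic model."""
    return simulate(predict_parameters(features))
```

Calling `forecast((0.8, 0.7))` returns a list of `(time, x, s, p)` tuples for a hypothetical cell line. Ranking cell lines by the final titre of such forecasts is one way the early-selection idea could be prototyped.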
