
Computer modelling of metabolic adaptions during mitochondrial dysfunction and machine learning to predict novel mitochondrial disease genes



Smith, Alexander Gary 


Mitochondria are organelles found in almost every eukaryote and are primarily responsible for generating chemical energy in the form of adenosine triphosphate. This thesis investigates two main causes of mitochondrial dysfunction: mitochondrial toxicity arising from side effects of drugs, and mitochondrial diseases arising from defects in nuclear-encoded genes.

Novel chemical entities being developed as drug leads are screened for cellular toxicity, of which mitochondrial dysfunction is a major cause. However, our lack of understanding of the metabolic adaptations to mitochondrial dysfunction limits how accurately pharmaceutical companies can screen for it, and so prevents potentially useful drugs from being developed. To further our understanding of these adaptations, I analysed a large-scale metabolomics data set from rats administered a known mitochondrial complex III inhibitor. The analyses revealed many perturbed pathways which can be exploited as biomarkers of mild mitochondrial dysfunction, a condition which is currently clinically undetectable during the drug development process. To direct future studies on mitochondrial dysfunction, a multi-organ model of mitochondrial metabolism was generated and used to simulate inhibition of the mitochondrial respiratory complexes. The simulations of complex III inhibition accurately predicted many of the metabolite behaviours identified in the metabolomics analyses and provided theories for their significance. Simulations of inhibition of the other complexes identified many unique behaviours which can be used to direct future studies. Such studies would greatly improve our understanding of the metabolic adaptations and provide higher-confidence biomarkers.

Mitochondrial dysfunction is linked to many late-onset diseases such as Parkinson’s, and inborn errors of mitochondrial metabolism cause severe neurological and physiological diseases. Patients with suspected mitochondrial disease have their DNA sequenced and analysed. Diagnosis of mitochondrial disease by sequencing requires knowledge of the mitochondrial proteome, which is currently incomplete. A predicted mitochondrial proteome was generated using a support vector machine trained on the abundance of protein localisation data available in the MitoMiner database. The support vector machine identified 442 novel mitochondrial proteins. The success rate of diagnosing mitochondrial disease using sequencing is currently limited by our inability to filter and prioritise a patient’s DNA variants. Patients who do not have a variant in one of the already known mitochondrial disease genes are usually left with hundreds of potential disease-causing variants. For each gene in the mitochondrial proteome, a probability of being disease-causing was generated using two trained neural networks. The networks were trained on a wide range of data sources for differentiating mitochondrial disease genes, including protein-protein interaction network metrics, gene tissue expression and protein evolution. The predicted probabilities allow for better filtering and prioritisation of a patient’s variants, so that candidate disease-causing genes can be experimentally verified. The predicted mitochondrial proteome and its predicted disease-causing probabilities are currently used in an NGS analysis pipeline at the MRC Mitochondrial Biology Unit for diagnosing mitochondrial disease patient samples.





Robinson, Alan


Machine learning, Modelling, Mitochondria, Mitochondrial disease, Metabolomics


Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge