Joint Network Modeling of Omics Data for Understanding Complex Diseases
Abstract
In recent years, network models have become increasingly important for the analysis of molecular data due to their ability to represent complex interplay within biological systems. The availability of diverse molecular data sources, stemming from advancements in high-throughput genomic technologies, encourages the development of more sophisticated models. Through the simultaneous analysis of multiple data sets or types, joint network approaches enhance statistical power and provide a route towards a deeper understanding of the mechanisms that underlie biological processes. This thesis proposes statistical methods for joint network inference within both frequentist and Bayesian frameworks, with a focus on scalability to high-dimensional data.
Developing a flexible and scalable framework for multiple network inference is challenging, and existing methods often excel in only one aspect. I propose a novel penalty selection procedure for the widely used joint graphical lasso that maintains high performance for high-dimensional data, and I demonstrate the potential of the method on proteomic data from a pan-cancer study.
Bayesian estimators such as the graphical horseshoe are popular for network modelling because of their desirable properties, but they face scalability issues. Moreover, although robust and flexible, the current formulation of the graphical horseshoe estimator does not allow for simultaneous inference of multiple networks. In the second part of my thesis, I propose the joint graphical horseshoe estimator, which facilitates information sharing across networks and employs a fast expectation conditional maximization algorithm. I leverage the unique joint modelling properties of the approach to clarify gene regulation in immune-related disease pathogenesis.
Finally, with a view towards precision medicine, I present a systematic approach for assessing and characterising differences in biological function in the omics networks of multiple clinical groups. Applying this strategy to proteomic data from a phase II clinical trial investigating the effect of neoadjuvant therapy in breast carcinomas, I identify potential mechanisms for disease progression and treatment response.
In summary, throughout this thesis I develop a range of statistical techniques to study biological networks derived from omics data, with applications encompassing cancer research.