Understanding Disease and Disease Relationships Using Transcriptomic Data

Oerton, Erin

Repository URI

https://www.repository.cam.ac.uk/handle/1810/289128

Repository DOI

https://doi.org/10.17863/CAM.36391

Files

Thesis (9.3 MB)

Type

Thesis

Authors

Oerton, Erin

Abstract

As the volume of transcriptomic data continues to increase, so too does its potential to deepen our understanding of disease; for example, by revealing gene expression patterns shared between diseases. However, key questions remain around the strength of the transcriptomic signal of disease and the identification of meaningful commonalities between datasets, which are addressed in this thesis as follows.

The first chapter, Concordance of Microarray Studies of Parkinson’s Disease, examines the agreement between differential expression signatures across 33 studies of Parkinson’s disease. Comparison of these studies, which cover a range of microarray platforms, tissues, and disease models, reveals a characteristic pattern of differential expression in the most highly affected tissues in human patients. By using correlation and clustering analyses to measure how well different study designs represent human disease, this work provides a guideline for the comparison of microarray studies in the following chapters.

In the next chapter, Using Dysregulated Signalling Paths to Understand Disease, gene expression changes are linked on the human signalling network, enabling identification of network regions dysregulated in disease. Applying this method across a large dataset of 141 common and rare diseases identifies dysregulated processes shared between diverse conditions, which relate to known disease- and drug-sharing relationships.

The final chapter, Understanding and Predicting Disease Relationships Through Similarity Fusion, explores the integration of gene expression with other data types – in this case, ontological, phenotypic, literature co-occurrence, genetic, and drug data – to understand relationships between diseases. A similarity fusion approach is proposed to overcome the differences in data type properties between these spaces, resulting in the identification of novel disease relationships spanning multiple bioinformatic levels. The similarity of disease relationships across data types is also considered, revealing that relationships in differential expression space are distinct from those in other molecular and clinical spaces.

In summary, the work described in this thesis sets out a framework for the comparative analysis of transcriptomic data in disease, including the integration of biological networks and other bioinformatic data types, in order to further our knowledge of diseases and the relationships between them.

Date

2018-09-04

Advisors

Bender, Andreas

Keywords

transcriptomics, gene expression, disease relationships, biological data integration, gene expression microarray

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Sponsorship

PhD funded by the Biotechnology and Biological Sciences Research Council Doctoral Training Partnership

Collections

Theses - Chemistry