Curation, characterisation and prediction of Drosophila signalling pathway members


Type

Thesis

Authors

Abstract

Signalling pathways are key to virtually every aspect of the biology of multicellular organisms. Extensive research in Drosophila melanogaster has greatly contributed to the understanding of these pathways, but a central resource distilling the vast literature on the topic has been lacking. At the same time, there is now a large amount of publicly available functional genomics data in Drosophila that, if appropriately analysed, might be able to contribute to further progress in the study of signalling pathways. Here, I describe an effort to systematise what is currently known about which genes are part of Drosophila pathways and use the resulting resource as the foundation for machine learning analyses, aiming to address whether existing data can be used to predict novel pathway members.

First, I describe my contribution to a systematic review of the literature on Drosophila signalling pathways. High-confidence lists of member genes were established for 16 pathways, and annotated using the Gene Ontology controlled vocabulary. The results of this review have been presented in a publicly available resource in the FlyBase database. Second, I performed analyses of various published data aiming to characterise the biological properties of genes within pathways. These analyses showed that members of a given pathway have correlated mRNA expression profiles and higher numbers of both physical and genetic interactions with each other than expected by chance, but do not show strong trends of having arisen in the same period during the history of life. Pathway members also have fewer loss-of-function variants in natural Drosophila populations than other genes, highlighting their biological importance. Third, I established a machine learning pipeline that makes use of these various types of data to predict new candidate pathway members, using the annotated members as positive training examples. The predictions displayed high accuracy in recognising true annotated members held out from training, suggesting that the predicted new members are useful candidates for future experimental work.

Overall, the work presented here highlights the importance of systematic curation of published findings to biological research. It also demonstrates how such curation, when combined with computational analyses of published data, can contribute to continued progress in the study of Drosophila signalling pathways.

Date

2022-01-01

Advisors

Brown, Nicholas H

Keywords

Signalling pathways, Drosophila, Gene Ontology, Machine learning, Biocuration

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge