Analysis of the understudied parts of the phospho-signalome using machine learning methods

cam.depositDate: 2022-02-18
cam.restriction: thesis_access_open
cam.supervisor: Petsalaki, Evangelia
cam.thesis.confidential: false
cam.thesis.confidential-clearance: None - this thesis does not contain confidential and / or sensitive information
cam.thesis.copyright: false
cam.thesis.copyright-clearance: No copyright - this thesis does not include material with third party copyright
dc.contributor.author: Petursson, Borgthor
dc.date.accessioned: 2022-04-21T15:54:46Z
dc.date.available: 2022-04-21T15:54:46Z
dc.date.submitted: 2021-08-28
dc.date.updated: 2022-02-18T12:34:03Z
dc.description.abstract: In order to make decisions and respond appropriately to external stimuli, cells rely on an intricate signalling system. One of the most important and best studied components of this signalling system is the phospho-signalling network. Phosphorylation relays information by adding phosphoryl groups onto substrates such as lipids or proteins, which in turn leads to changes in substrate function. Crucial components of this system include kinases, which phosphorylate the substrate molecule, and phosphatases, which remove the phosphoryl group from the substrate. To date, even though >100K phosphoproteins have been identified through high-throughput experiments, the vast majority of phosphosites are of unknown function, while over a third of kinases have no known substrate (Needham et al., 2019). Furthermore, there is a large study bias in our current knowledge, demonstrated by a disproportionate number of interactions between highly cited kinases and substrates (Invergo and Beltrao, 2018). The vast understudied signalling space, combined with this study bias, makes it difficult to understand the general principles underpinning cell signalling regulation and underlines the need to study the phosphoproteomic signalling system in an unbiased manner. The central aim of this thesis is to use data-driven and unbiased approaches to study the human phosphoproteomic signalling network.

The first chapter describes a project in which I co-developed a machine learning model to predict signed kinase-kinase regulatory circuits based on kinase specificities and high-throughput phosphoproteomic and transcriptomic data. The network was validated using independent high-throughput data and used to identify novel kinase-kinase regulatory interactions. This project was done in collaboration with Brandon Invergo, a postdoc in Pedro Beltrao’s research group.

In the second chapter I expand upon the work done in the first chapter. I used predictors such as co-expression, kinase specificities and variables characterising potential kinase-substrate target phosphosites to predict kinase-substrate relationships and their signs. I then used independent experimental kinase-substrate predictions to validate these predictions and identify high-confidence kinase-substrate relationships, and combined the kinase-substrate predictions with the kinase-kinase regulatory circuits to identify condition-specific signalling networks. To make my method, the networks and the analysis of phosphoproteomics data accessible to non-expert users, I also developed the SELPHI2 server, where users can extract biological insight from their own datasets. SELPHI2 is a substantial improvement upon the SELPHI server, which was developed in 2015 by my supervisor, Evangelia Petsalaki.

The third chapter aims to identify signalling modules from phosphoproteomic data, in order to study the architecture of human cell signalling networks at a whole-cell level and to address the limited predictive power of current models of cell signalling, such as the pathways found in KEGG (Kanehisa, 2019), Reactome (Jassal et al., 2020) and WikiPathways (Slenter et al., 2018). These data-extracted modules were found to have greater predictive power for independent data sets in terms of the number of significant enrichments. Furthermore, we sought to predict the probability of module co-membership from predictors such as membership within data-driven modules, co-phosphorylation and co-expression.

In summary, the work presented here seeks to explore the understudied phospho-signalling systems through system-wide prediction of kinase-substrate regulation and the identification of phospho-signalling modules through data-driven means.
dc.identifier.doi: 10.17863/CAM.83743
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/336324
dc.language.iso: eng
dc.publisher.college: Magdalene
dc.publisher.institution: EMBL-EBI
dc.rights: Attribution 4.0 International (CC BY 4.0)
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: Cell signalling
dc.subject: phospho-signalling
dc.subject: machine learning
dc.title: Analysis of the understudied parts of the phospho-signalome using machine learning methods
dc.type: Thesis
dc.type.qualificationlevel: Doctoral
dc.type.qualificationname: Doctor of Philosophy (PhD)
pubs.licence-display-name: Apollo Repository Deposit Licence Agreement
pubs.licence-identifier: apollo-deposit-licence-2-1
rioxxterms.licenseref.uri: https://creativecommons.org/licenses/by/4.0/
rioxxterms.type: Thesis

Files

Original bundle
Name: Thesis_BP_EBI_bkg_revision_fin.pdf
Size: 4.73 MB
Format: Adobe Portable Document Format
Description: Thesis
Licence
https://creativecommons.org/licenses/by/4.0/