Semi-Supervised Non-Parametric Bayesian Modelling of Spatial Proteomics.
Authors
Crook, Oliver M
Lilley, Kathryn S
Gatto, Laurent
Kirk, Paul DW
Publication Date
2022-12-01
Journal Title
Ann Appl Stat
ISSN
1932-6157
Publisher
Institute of Mathematical Statistics
Type
Article
This Version
AM
Metadata
Citation
Crook, O. M., Lilley, K. S., Gatto, L., & Kirk, P. D. (2022). Semi-Supervised Non-Parametric Bayesian Modelling of Spatial Proteomics. Ann Appl Stat. https://doi.org/10.1214/22-AOAS1603
Abstract
Understanding sub-cellular protein localisation is an essential component in the analysis of context-specific protein function. Recent advances in quantitative mass spectrometry (MS) have led to high-resolution mapping of thousands of proteins to sub-cellular locations within the cell. Novel modelling considerations to capture the complex nature of these data are thus necessary. We approach analysis of spatial proteomics data in a non-parametric Bayesian framework, using K-component mixtures of Gaussian process regression models. The Gaussian process regression model accounts for correlation structure within a sub-cellular niche, with each mixture component capturing the distinct correlation structure observed within each niche. The availability of marker proteins (i.e. proteins with a priori known labelled locations) motivates a semi-supervised learning approach to inform the Gaussian process hyperparameters. We moreover provide an efficient Hamiltonian-within-Gibbs sampler for our model. Furthermore, we reduce the computational burden associated with inversion of covariance matrices by exploiting the structure in the covariance matrix. A tensor decomposition of our covariance matrices allows extended Trench and Durbin algorithms to be applied to reduce the computational complexity of inversion and hence accelerate computation. We provide detailed case studies on Drosophila embryos and mouse pluripotent embryonic stem cells to illustrate the benefit of semi-supervised functional Bayesian modelling of the data.
Keywords
stat.AP
Sponsorship
Biotechnology and Biological Sciences Research Council (BB/N023129/1)
Wellcome Trust (110170/Z/15/Z)
Identifiers
External DOI: https://doi.org/10.1214/22-AOAS1603
This record's URL: https://www.repository.cam.ac.uk/handle/1810/336368