Semi-Supervised Non-Parametric Bayesian Modelling of Spatial Proteomics
Journal Title
Annals of Applied Statistics
ISSN
1932-6157
Publisher
Institute of Mathematical Statistics
Type
Article
This Version
AM
Citation
Crook, O., Lilley, K., Gatto, L., & Kirk, P. Semi-Supervised Non-Parametric Bayesian Modelling of Spatial Proteomics. Annals of Applied Statistics https://doi.org/10.17863/CAM.83785
Abstract
Understanding sub-cellular protein localisation is essential for analysing
context-specific protein function. Recent advances in quantitative
mass-spectrometry (MS) have led to high-resolution mapping of thousands of
proteins to sub-cellular locations within the cell. Novel modelling
considerations to capture the complex nature of these data are thus necessary.
We approach analysis of spatial proteomics data in a non-parametric Bayesian
framework, using mixtures of Gaussian process regression models. The Gaussian
process regression model accounts for correlation structure within a
sub-cellular niche, with each mixture component capturing the distinct
correlation structure observed within each niche. Proteins with a priori
labelled locations motivate using semi-supervised learning to inform the
Gaussian process hyperparameters. We moreover provide an efficient
Hamiltonian-within-Gibbs sampler for our model. As in other recent work, we
reduce the computational burden associated with inversion of covariance
matrices by exploiting the structure in the covariance matrix. A tensor
decomposition of our covariance matrices allows extended Trench and Durbin
algorithms to be applied in order to reduce the computational complexity of
inversion and hence accelerate computation. A stand-alone R-package
implementing these methods using high-performance C++ libraries is available
at: https://github.com/ococrook/toeplitz
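
The Trench and Durbin algorithms named in the abstract are classical methods for symmetric Toeplitz matrices, which are constant along each diagonal. As a rough illustration of why that structure helps, the sketch below solves T x = b using the Levinson recursion, a close relative of Durbin's algorithm, in O(n^2) operations rather than the O(n^3) of a generic dense solve. It is not the extended algorithm of the paper and not the API of the toeplitz package; the function name solve_symmetric_toeplitz and its arguments are hypothetical, and T is assumed to be symmetric positive definite.

```python
def solve_symmetric_toeplitz(first_col, rhs):
    """Solve T x = rhs, where T[i][j] = first_col[abs(i - j)].

    Illustrative Levinson recursion (hypothetical helper, not the paper's code).
    Assumes T is symmetric positive definite.
    """
    n = len(first_col)
    if n != len(rhs):
        raise ValueError("first_col and rhs must have the same length")
    if n == 0:
        return []

    # Normalise so the diagonal of T is 1, as the classical recursion assumes.
    r0 = float(first_col[0])
    r = [c / r0 for c in first_col[1:]]   # off-diagonal values r_1 .. r_{n-1}
    b = [v / r0 for v in rhs]

    x = [b[0]]            # solution of the leading 1x1 system
    if n == 1:
        return x

    y = [-r[0]]           # Durbin (Yule-Walker) solution for the 1x1 block
    alpha = -r[0]
    beta = 1.0

    for k in range(1, n):
        # Grow the solution of the leading k x k system to (k+1) x (k+1).
        beta *= 1.0 - alpha * alpha
        mu = (b[k] - sum(r[i] * x[k - 1 - i] for i in range(k))) / beta
        x = [x[i] + mu * y[k - 1 - i] for i in range(k)] + [mu]

        if k < n - 1:
            # Durbin step: update the auxiliary Yule-Walker vector.
            alpha = -(r[k] + sum(r[i] * y[k - 1 - i] for i in range(k))) / beta
            y = [y[i] + alpha * y[k - 1 - i] for i in range(k)] + [alpha]

    return x


# Usage: T has first column [4, 2, 1]; rhs chosen so the solution is [1, 2, 3].
solution = solve_symmetric_toeplitz([4.0, 2.0, 1.0], [11.0, 16.0, 17.0])
```

Only the first column of T is ever stored, so memory use is O(n) as well. Each pass of the loop does O(k) work, which is where the quadratic total cost comes from.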
Keywords
stat.AP
Sponsorship
Biotechnology and Biological Sciences Research Council (BB/N023129/1)
Wellcome Trust (110170/Z/15/Z)
Embargo Lift Date
2025-04-22
Identifiers
This record's DOI: https://doi.org/10.17863/CAM.83785
This record's URL: https://www.repository.cam.ac.uk/handle/1810/336368