
Bayesian profile regression for clustering analysis involving a longitudinal response and explanatory variables.

Published version



Rouanet, Anaïs 
Johnson, Rob 
Strauss, Magdalena 
Richardson, Sylvia 
Tom, Brian D 


The identification of sets of co-regulated genes that share a common function is a key question in modern genomics. Bayesian profile regression is a semi-supervised mixture modelling approach that uses a response to guide inference toward relevant clusterings. Previous applications of profile regression have considered univariate continuous, categorical, and count outcomes. In this work, we extend Bayesian profile regression to cases where the outcome is longitudinal (or multivariate continuous) and provide PReMiuMlongi, an updated version of PReMiuM, the R package for profile regression. We consider multivariate normal and Gaussian process regression response models and provide proof-of-principle applications in four simulation studies. The model is applied to budding yeast data to identify groups of genes co-regulated during the Saccharomyces cerevisiae cell cycle. We identify four distinct groups of genes associated with specific patterns of gene expression trajectories, along with the bound transcription factors likely involved in their co-regulation.



49 Mathematical Sciences, 31 Biological Sciences, 4905 Statistics, Genetics, 1.1 Normal biological development and functioning, 1 Underpinning research, Generic health relevance

Journal Title

Methodology (Gott)


Publisher

Oxford University Press (OUP)

Funding

National Institute for Health and Care Research (IS-BRC-1215-20014)