Annealed variational mixtures for disease subtyping and biomarker discovery.

Accepted version
Peer-reviewed

Abstract

Cluster analyses of high-dimensional data are often hampered by the presence of large numbers of variables that do not provide relevant information, as well as the perennial issue of choosing an appropriate number of clusters. These challenges are frequently encountered when analysing omics datasets, such as in molecular precision medicine, where a key goal is to identify disease subtypes and the biomarkers that define them. Here we introduce an annealed variational Bayes algorithm for fitting high-dimensional mixture models while performing variable selection. Our algorithm is scalable and computationally efficient, and we provide an open source Python implementation, VBVarSel. In a range of simulated and real biomedical examples, we show that VBVarSel outperforms the current state of the art, and demonstrate its use for cancer subtyping and biomarker discovery.

Description

Journal Title

Stat Appl Genet Mol Biol

Journal ISSN

2194-6302
1544-6115

Volume Title

25

Publisher

De Gruyter

Rights and licensing

Except where otherwise noted, this item's license is described as Attribution 4.0 International

Sponsorship

Engineering and Physical Sciences Research Council (EP/R018561/1)