Bayesian Nonparametric Ordination for the Analysis of Microbial Communities.

Accepted version
Peer-reviewed

Type

Article

Authors

Ren, Boyu 
Bacallado, Sergio 
Favaro, Stefano 
Holmes, Susan 
Trippa, Lorenzo 

Abstract

Human microbiome studies use sequencing technologies to measure the abundance of bacterial species or Operational Taxonomic Units (OTUs) in samples of biological material. Typically, the data are organized in contingency tables with OTU counts across heterogeneous biological samples. In the microbial ecology community, ordination methods are frequently used to investigate latent factors or clusters that capture and describe variations of OTU counts across biological samples. It remains important to evaluate how uncertainty in estimates of each biological sample's microbial distribution propagates to ordination analyses, including visualization of clusters and projections of biological samples on low-dimensional spaces. We propose a Bayesian analysis for dependent distributions to endow frequently used ordinations with estimates of uncertainty. A Bayesian nonparametric prior for dependent normalized random measures is constructed, which is marginally equivalent to the normalized generalized Gamma process, a well-known prior for nonparametric analyses. In our prior, the dependence and similarity between microbial distributions are represented by latent factors that concentrate in a low-dimensional space. We use a shrinkage prior to tune the dimensionality of the latent factors. The resulting posterior samples of model parameters can be used to evaluate uncertainty in analyses routinely applied in microbiome studies. Specifically, by combining them with multivariate data analysis techniques, we can visualize credible regions in ecological ordination plots. The characteristics of the proposed model are illustrated through a simulation study and applications in two microbiome datasets.
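
The idea of carrying posterior uncertainty into an ordination plot can be sketched in a few lines. The example below is only an illustration under strong assumptions and is not the model proposed in the article: the posterior of each sample's microbial distribution is replaced by an independent Dirichlet stand-in, the count table is made up, and the helper names (`bray_curtis`, `pcoa`, `procrustes_align`) are hypothetical. Each posterior draw is ordinated by classical multidimensional scaling and aligned to a reference configuration, so that the scatter of a sample's coordinates across draws indicates its uncertainty region.

```python
import numpy as np


def bray_curtis(p):
    """Pairwise Bray-Curtis dissimilarity between the rows of p."""
    diff = np.abs(p[:, None, :] - p[None, :, :]).sum(axis=2)
    total = (p[:, None, :] + p[None, :, :]).sum(axis=2)
    return diff / total


def pcoa(d, k=2):
    """Classical multidimensional scaling (PCoA) of a distance matrix d."""
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (d ** 2) @ centering
    vals, vecs = np.linalg.eigh(gram)
    order = np.argsort(vals)[::-1][:k]          # k largest eigenvalues
    vals = np.clip(vals[order], 0.0, None)      # guard against negative values
    return vecs[:, order] * np.sqrt(vals)


def procrustes_align(x, ref):
    """Rotate or reflect x to best match ref (orthogonal Procrustes)."""
    u, _, vt = np.linalg.svd(x.T @ ref)
    return x @ (u @ vt)


rng = np.random.default_rng(0)

# Made-up OTU count table: 4 biological samples (rows) by 5 OTUs (columns).
counts = np.array([
    [50, 30, 15, 5, 0],
    [45, 35, 10, 8, 2],
    [5, 10, 20, 40, 25],
    [2, 8, 25, 35, 30],
])

# ASSUMPTION: an independent Dirichlet(counts + 0.5) posterior per sample
# stands in for posterior draws from a dependent nonparametric model.
n_draws = 200
draws = np.stack([
    np.stack([rng.dirichlet(row + 0.5) for row in counts])
    for _ in range(n_draws)
])  # shape: (n_draws, n_samples, n_otus)

# Reference ordination from the empirical proportions.
proportions = counts / counts.sum(axis=1, keepdims=True)
ref = pcoa(bray_curtis(proportions))

# Ordinate every posterior draw and align it to the reference.
coords = np.stack([
    procrustes_align(pcoa(bray_curtis(p)), ref) for p in draws
])  # shape: (n_draws, n_samples, 2)

centre = coords.mean(axis=0)   # posterior mean position of each sample
spread = coords.std(axis=0)    # per-axis posterior spread of each sample
```

Plotting the rows of `coords[:, i, :]` for each sample `i` (for instance as a scatter or a contour) would give a rough visual analogue of a credible region. The alignment step matters because an ordination is only defined up to rotation and reflection, so unaligned draws would not be comparable.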

Keywords

Bayesian factor analysis, Dependent Dirichlet processes, Microbiome data analysis, Uncertainty of ordination

Journal Title

J Am Stat Assoc

Journal ISSN

0162-1459
1537-274X

Publisher

Informa UK Limited

Sponsorship

B. Ren is supported by the National Science Foundation under Grant No. DMS-1042785. S. Favaro is supported by the European Research Council (ERC) through StG N-BNP 306406. L. Trippa has been supported by the Claudia Adams Barr Program in Innovative Basic Cancer Research. S. Holmes was supported by NIH grant R01AI112401.