
A compendium of 32,277 metagenome-assembled genomes and over 80 million genes from the early-life human gut microbiome.

Published version





Zeng, Shuqin 
Patangia, Dhrati 
Zhou, Zhemin 


Age-specific reference genomes of the human gut microbiome can provide higher resolution for metagenomic analyses, including taxonomic classification, strain-level genomic investigation, and functional characterization. We present the Early-Life Gut Genomes (ELGG) catalog, with 32,277 genomes representing 2,172 species from 6,122 fecal metagenomes collected from children under 3 years old, spanning a range of delivery modes, gestational ages, feeding patterns, and geographies. The ELGG expanded phylogenetic diversity by 38% over isolate microbial genomes and broadened the genomic landscape of the early-life microbiome, increasing the recruitment of metagenomic reads to 82.8%. More than 60% of the ELGG species lack an isolate representative. Conspecific genomes of the most abundant species differed between children and adults in gene diversity and function. The ELGG genomes encode over 80 million protein sequences, forming the Early-Life Gut Proteins (ELGP) catalog with over four million protein clusters, 29.5% of which lack functional annotations. The ELGG and ELGP references provide new insights into the early-life human gut microbiome and will facilitate studies of its development and of the mechanisms by which it is disturbed in early life.



Adult; Child; Child, Preschool; Gastrointestinal Microbiome; Humans; Metagenome; Metagenomics; Microbiota; Phylogeny

Journal Title

Nat Commun

Springer Science and Business Media LLC
MRC (MR/W016184/1)