
Recovery and quality estimation of metagenomic assembled genomes of eukaryotes


Type

Thesis

Authors

Saary, Paul 

Abstract

Microorganisms are found in virtually all environments. While the majority of microorganisms are often prokaryotes (by biomass, there are suspected to be six times more prokaryotes than fungi globally; Bar-On et al., 2018), eukaryotes are also important constituents of microbial communities (e.g. on human skin). Shotgun metagenomics can provide access to the combined genetic information of a community in a culture-independent manner. Using de novo assembly and post-processing methods, it can lead to the generation of so-called metagenomic assembled genomes (MAGs), which provide contextualised access to the genes of these elusive organisms. As most studies have focused on the recovery of prokaryotic MAGs, I first examined the limitations and gaps of existing tools with respect to their ability to recover microbial eukaryotic genomes. This led to the development of EukCC, a software tool to estimate the completeness and contamination of eukaryotic MAGs. Evaluation of this software showed that it is well suited for the fully automated recovery of eukaryotic MAGs. This workflow was applied to datasets obtained from several biomes to recover eukaryotic MAGs. However, I also demonstrate that eukaryotic MAGs can sometimes be fragmented, and I developed a merging algorithm to create merged MAGs (mMAGs). With this algorithm implemented in EukCC 2, I searched a large number of datasets from MGnify for known and novel eukaryotic MAGs. Completing the eukaryotic MAG recovery process, I discuss how species-level dereplication for eukaryotes can be approached based on genetic information alone. In summary, I show that the recovery of eukaryotic MAGs is challenging but can be largely automated, allowing large-scale studies to be performed.

Date

2022-06-07

Advisors

Finn, Robert

Keywords

metagenomics, bioinformatics, microbiology, metagenomic assembled genomes, eukaryote

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Sponsorship

EMBL PhD Programme