Evaluating genotype imputation pipeline for ultra-low coverage ancient genomes


Authors
Hui, Ruoyun 
D’Atanasio, Eugenia 
Cassidy, Lara M. 
Scheib, Christiana L. 
Kivisild, Toomas 
Abstract

Although ancient DNA data have become increasingly important in studies of past populations, it is often not feasible or practical to obtain high-coverage genomes from poorly preserved samples. While accurate genotype imputation from > 1× coverage data has recently become routine, a large proportion of ancient samples remain unusable for downstream analyses because of their low coverage. Here, we evaluate a two-step pipeline for the imputation of common variants in ancient genomes at 0.05–1× coverage. We first use the genotype likelihood input mode in Beagle and filter for confident genotypes, then use these as the input to impute missing genotypes. This procedure, when tested on ancient genomes, outperforms a single-step imputation from genotype likelihoods, suggesting that current genotype callers do not fully account for errors in ancient sequences and that additional quality controls can be beneficial. We compare the effects of various genotype likelihood calling methods, post-calling, pre-imputation and post-imputation filters, different reference panels, and different imputation tools. In a Neolithic Hungarian genome, we obtain ~90% imputation accuracy for heterozygous common variants at 0.05× coverage and > 97% accuracy at 0.5× coverage. We show that imputation can mitigate, though not eliminate, reference bias in ultra-low coverage ancient genomes.
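
The filtering step between the two imputation passes can be illustrated with a minimal sketch. It assumes that the first pass yields, for each site, three posterior genotype probabilities (homozygous reference, heterozygous, homozygous alternate), and it keeps a call only when the most probable genotype reaches a cutoff. The function name `filter_confident_genotypes` and the cutoff of 0.99 are hypothetical choices for illustration; the abstract does not state the filter actually used.

```python
from typing import List, Optional, Sequence, Tuple

# Assumed layout per site: (P(hom-ref), P(het), P(hom-alt)).
GenotypeProbs = Tuple[float, float, float]


def filter_confident_genotypes(
    probs: Sequence[GenotypeProbs],
    threshold: float = 0.99,  # hypothetical cutoff, not taken from the paper
) -> List[Optional[int]]:
    """Return the alternate-allele count (0, 1 or 2) per site, or None.

    A site is set to None (missing) when its most probable genotype falls
    below the threshold, so that a second imputation pass can fill it in.
    """
    calls: List[Optional[int]] = []
    for site in probs:
        # Index of the most probable genotype equals its alternate-allele count.
        best = max(range(3), key=lambda g: site[g])
        calls.append(best if site[best] >= threshold else None)
    return calls


site_probs = [
    (0.995, 0.004, 0.001),  # confident homozygous reference
    (0.40, 0.55, 0.05),     # ambiguous, left missing
    (0.0, 0.002, 0.998),    # confident homozygous alternate
]
confident = filter_confident_genotypes(site_probs)
```

Sites returned as `None` are the ones a second imputation step would treat as missing. A stricter cutoff leaves more sites for that step, while a looser one passes more potentially erroneous calls through as fixed input.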

Description

Funder: Sapienza Università di Roma

Keywords
Article, /631/181/19, /631/181/27, /631/181/2474, /631/181/457
Journal Title
Scientific Reports
Journal ISSN
2045-2322
Volume Title
10
Publisher
Nature Publishing Group UK
Sponsorship
Wellcome Trust (2000368/Z/15/Z)