Evaluating genotype imputation pipeline for ultra-low coverage ancient genomes

Published version
Peer-reviewed

Authors

Hui, Ruoyun 
D’Atanasio, Eugenia 
Cassidy, Lara M. 
Scheib, Christiana L. 
Kivisild, Toomas 

Abstract

Although ancient DNA data have become increasingly important in studies of past populations, it is often not feasible or practical to obtain high-coverage genomes from poorly preserved samples. While accurate genotype imputation from >1× coverage data has recently become routine, a large proportion of ancient samples remain unusable for downstream analyses because of their low coverage. Here, we evaluate a two-step pipeline for the imputation of common variants in ancient genomes at 0.05–1× coverage. We first run Beagle in its genotype likelihood input mode, then filter for confident genotypes and use these as the input for imputing the missing genotypes. This procedure, when tested on ancient genomes, outperforms a single-step imputation from genotype likelihoods, suggesting that current genotype callers do not fully account for errors in ancient sequences and that additional quality controls can be beneficial. We compare the effects of various genotype likelihood calling methods; post-calling, pre-imputation and post-imputation filters; different reference panels; and different imputation tools. In a Neolithic Hungarian genome, we obtain ~90% imputation accuracy for heterozygous common variants at 0.05× coverage and >97% accuracy at 0.5× coverage. We show that imputation can mitigate, though not eliminate, reference bias in ultra-low coverage ancient genomes.

Description

Funder: Sapienza Università di Roma

Keywords

Article, /631/181/19, /631/181/27, /631/181/2474, /631/181/457

Journal Title

Scientific Reports

Journal ISSN

2045-2322

Volume Title

10

Publisher

Nature Publishing Group UK

Sponsorship

Wellcome Trust (2000368/Z/15/Z)