An accurate assignment test for extremely low-coverage whole-genome sequence data.
Authors
Ferrari, G., Atmore, L. M., Jentoft, S., Jakobsen, K. S., Makowiecki, D., Barrett, J. H., Star, B.
Publication Date
2022-05
Journal Title
Mol Ecol Resour
ISSN
1755-098X
Publisher
Wiley
Language
en
Type
Article
This Version
AO
VoR
Citation
Ferrari, G., Atmore, L. M., Jentoft, S., Jakobsen, K. S., Makowiecki, D., Barrett, J. H., & Star, B. (2022). An accurate assignment test for extremely low-coverage whole-genome sequence data. Mol Ecol Resour. https://doi.org/10.1111/1755-0998.13551
Abstract
Genomic assignment tests can provide important diagnostic biological characteristics, such as population of origin or ecotype. Yet, assignment tests often rely on moderate- to high-coverage sequence data that can be difficult to obtain for fields such as molecular ecology and ancient DNA. We have developed a novel approach that efficiently assigns biologically relevant information (i.e., population identity or structural variants such as inversions) in extremely low-coverage sequence data. First, we generate databases from existing reference data using a subset of diagnostic single nucleotide polymorphisms (SNPs) associated with a biological characteristic. Low-coverage alignment files are subsequently compared to these databases to ascertain allelic state, yielding a joint probability for each association. To assess the efficacy of this approach, we assigned haplotypes and population identity in Heliconius butterflies, Atlantic herring, and Atlantic cod using chromosomal inversion sites and whole-genome data. We scored both modern and ancient specimens, including the first whole-genome sequence data recovered from ancient Atlantic herring bones. The method accurately assigns biological characteristics, including population membership, using extremely low-coverage data (as low as 0.0001x) based on genome-wide SNPs. This approach will therefore increase the number of samples in evolutionary, ecological and archaeological research for which relevant biological information can be obtained.
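The joint-probability idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the database layout (a dictionary mapping each diagnostic SNP to per-group allele frequencies), the function name `assign_group`, and the uniform sequencing-error model are all assumptions made for illustration. Each observed base at a diagnostic SNP contributes one factor to a per-group likelihood, and the factors are combined in log space and normalised across groups.

```python
import math


def assign_group(observations, database, error_rate=0.01):
    """Return normalised joint probabilities per group.

    observations: list of (snp_id, observed_base) taken from low-coverage reads.
    database: {snp_id: {group: {base: allele_frequency}}} (hypothetical layout);
              every SNP entry is assumed to list every group.
    error_rate: assumed probability that a read base is a sequencing error.
    """
    log_lik = {}
    for snp_id, base in observations:
        if snp_id not in database:
            continue  # read does not cover a diagnostic SNP
        for group, freqs in database[snp_id].items():
            freq = freqs.get(base, 0.0)
            # Correct base seen with prob. (1 - e); an error yields one of
            # the three other bases with equal probability.
            p = (1.0 - error_rate) * freq + (error_rate / 3.0) * (1.0 - freq)
            log_lik[group] = log_lik.get(group, 0.0) + math.log(p)

    if not log_lik:
        return {}  # no diagnostic sites covered

    # Normalise in log space to avoid underflow with many SNPs.
    top = max(log_lik.values())
    weights = {g: math.exp(v - top) for g, v in log_lik.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}
```

Because each covered SNP adds only one term to the sum, even a handful of reads overlapping diagnostic sites can separate the groups in this toy model, which matches the intuition behind assignment at very low coverage.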
Keywords
chromosomal inversion, ecotype, genome skimming, haplotype, population assignment, Animals, Butterflies, Ecotype, Gadus morhua, Genome, Haplotypes, Polymorphism, Single Nucleotide, Sequence Analysis, DNA
Sponsorship
The Aqua Genome Project (221734/O30)
Catching the Past (262777)
Marie Skłodowska‐Curie (813383)
Identifiers
men13551
External DOI: https://doi.org/10.1111/1755-0998.13551
This record's URL: https://www.repository.cam.ac.uk/handle/1810/331473
Rights
Licence:
http://creativecommons.org/licenses/by-nc/4.0/