
Investigating the genetic diversity, population structure and archaic admixture history in worldwide human populations using high-coverage genomes


Type

Thesis

Abstract

I present an analysis of 929 high-coverage (>30x) genomes from the Human Genome Diversity Project (HGDP) panel, a collection of cell lines from 54 populations across the world. Several data processing steps were necessary for downstream analysis, including lifting over resources from a different reference genome assembly, annotating the genome, and statistical phasing. Genome-wide genetic diversity agrees with previous studies based on SNP arrays and microsatellites, yet haplotype information reveals fine-scale structure and recent demographic history that vary between populations. This dataset also provides a valuable opportunity to explore the diversity and distribution of archaic segments in modern human populations. I implemented a hidden Markov model to detect such segments, based on patterns of allele sharing with sequenced archaic genomes and a sub-Saharan African control panel. I also compared several variants of the model and different training methods using simulated data. Applying the model to the HGDP dataset with two Neanderthal genomes and one Denisova genome, I detected variation in the level of archaic ancestry across continental regions, populations, and individuals within each population. I further compared Neanderthal and Denisovan segments with respect to their lengths, genomic distribution, divergence from the archaic genomes, nucleotide diversity, and haplotype networks to shed light on the structure of the admixture events. Neanderthal segments from all non-African populations appear largely homogeneous after accounting for the recent demographic history of modern human populations, which is consistent with a single admixture event that happened before these populations diverged from each other. In contrast, Denisovan haplotypes recovered from Oceania are distinctly separated from those recovered from East/South Asia, and the complicated structure in the latter cannot be explained by a single source of gene flow.
I therefore propose that more than one episode of admixture with different Denisova groups occurred in the ancestral population of present-day East Asian, South Asian and American populations after its separation from the ancestors of present-day Oceanians, and that a separate admixture event occurred between the ancestors of Oceanians and the Denisova population.

Date

2019-08-02

Advisors

Scally, Aylwyn

Keywords

population genetics, human genetics, archaic introgression, human genome diversity project

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Sponsorship

Gates Cambridge