Statistical analysis of short template switch mutations in human genomes
Many complex rearrangements arise in human genomes through template switch mutations, which occur during DNA replication when there is a transient polymerase switch to an alternate template nearby in three-dimensional space. These variants are routinely captured at kilobase-to-megabase scales in studies of genetic variation by using methods for structural variant calling. However, the genomic and evolutionary consequences of replication-based rearrangements remain poorly characterised at smaller scales, where they are usually interpreted as complex clusters of independent substitutions, insertions and deletions. In this thesis, I describe statistical methods for the detection and interpretation of short template switch mutations within DNA sequence data. I then use my methods to explore small-scale template switch mutagenesis within human genome evolution, population variation, and cancer. I show that small-scale, replication-based rearrangements are a ubiquitous feature of the germline and somatic mutational landscape of human genomes.