
Mapping the Genomic Context of Mutagenesis





Vöhringer, Harald 


The accumulation of genomic mutations leads to the formation of cancer. For this reason, many efforts have been undertaken to characterise mutational processes in terms of their genomic imprints. A particularly successful approach is matrix-based mutational signature analysis, which identifies prototypical mutation patterns by applying non-negative matrix factorisation to catalogues of single-nucleotide variants and other mutation types. However, mutagenesis is a multifaceted event that is affected by the genomic organisation of DNA and by cellular processes such as transcription, replication, and DNA repair. Moreover, since many mutational processes also generate characteristic multi-nucleotide variants, insertions and deletions, and structural variants, it appears valuable to jointly deconvolve broader mutational catalogues to better understand the complex nature of mutagenesis. In this thesis, I present TensorSignatures, an algorithm to learn mutational signatures jointly across different variant categories as well as their genomic localisation and properties. The analysis of 2,778 primary and 3,824 metastatic cancer genomes from the PCAWG consortium and the HMF cohort shows that practically all signatures operate dynamically in response to various genomic and epigenomic states. The analysis attributes the differential spectra of UV mutagenesis found in active and inactive chromatin to global-genome nucleotide excision repair. TensorSignatures accurately characterises transcription-associated mutagenesis, which is detected in seven different cancer types. The algorithm also extracts distinct signatures of replication- and double-strand break repair-driven mutagenesis by APOBEC3A and APOBEC3B, which differ in the number and length of their mutation clusters.
As a fourth example, TensorSignatures reproduces a signature of somatic hypermutation that generates highly clustered variants around the transcription start sites of active genes in lymphoid leukaemia, distinct from a more general and less clustered signature of Polη-driven translesion synthesis found in a broad range of cancer types. Finally, I demonstrate TensorSignatures’ utility by applying it to multiple datasets in various collaborative projects. Taken together, TensorSignatures adds detail to and refines mutational signature analysis by jointly learning mutation patterns and their genomic determinants. This sheds light on the manifold influences that underlie mutagenesis and helps to pinpoint mutagenic influences that cannot easily be distinguished based on mutation spectra alone. As mutational signature analysis is an essential element of the cancer genome analysis toolkit, TensorSignatures may help make the growing catalogues of mutational signatures more insightful by highlighting mutagenic mechanisms, or hypotheses thereof, to be investigated in greater depth.





Gerstung, Moritz


Mutational Signatures, Cancer, Genomics


Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge