Studies on approaches for sequencing cytosine modifications and the identification of their protein readers
Abstract
Alongside the genetic code, epigenetic mechanisms including covalent cytosine modifications facilitate dynamic changes in gene expression throughout development and differentiation. Such modifications are also drivers and markers of disease, and so attract considerable academic and clinical interest. A suite of sequencing methods allows base-resolution detection of all known cytosine modifications, but these approaches are often destructive, and they are error-prone because they rely on subtractive analysis of multiple experiments. Proteins that bind cytosine modifications often interact with both marks in a CpG dyad, but current sequencing methods cannot report the double-stranded context of base modifications, so the prevalence of double-stranded states and their interactions with proteins are poorly understood. Chapter 2 of this thesis describes studies exploring genetic code expansion to allow simultaneous, direct detection of cytosine modifications alongside canonical bases. An unnatural base pair comprising a 5-formylcytosine derivative with an altered Watson-Crick hydrogen bonding pattern was designed, and its pairing in duplex DNA was confirmed biophysically. The unnatural base pair is tolerated by DNA polymerases, which allowed optimisation of an expanded Sanger sequencing strategy to directly detect 5-formylcytosine and 5-hydroxymethylcytosine in DNA oligonucleotides. Chapter 3 builds on a novel sequencing method that provides double-stranded information on cytosine modifications in CpG contexts to identify proteins interacting selectively with different combinations of these marks. A mass spectrometry-based screen identified the epigenetic priming factors Dppa2 and Dppa4 as combination-specific interactors of 5-hydroxymethylcytosine, though further work is required to determine conclusively the mechanism by which these proteins interact with 5-hydroxymethylcytosine.
