Quantitative modelling of CRISPR-Cas editing outcomes


Type

Thesis

Authors

Pallaseni, Ananth 

Abstract

Development of the CRISPR-Cas toolkit over the last decade has enabled unprecedented control over the genome and unlocked the capacity for new experiments and gene therapies. However, these technologies are harder to use because the rate and type of individual genetic outcomes vary between editors. This variance is reproducible and a function of the sequence being targeted, which makes it amenable to quantitative modelling. My research focuses on building such models in a variety of contexts.

In this thesis, I recount the history of gene editing from transgenesis to prime editors. I then review the modelling techniques I use in my projects, before covering the state of predictive modelling for genome engineering. In two main results chapters, I discuss my work on modelling base editor outcomes and the effects of DNA repair context on Cas9-induced double-stranded break repair. The final results chapter covers shorter, collaborative studies on other gene editing technologies. I conclude by discussing the need for computational tools in genome editing and the gaps in our understanding.

In my first results chapter, I examine the sequence- and position-specificity of base editor activity. Base editors are a gene editing technology derived from Cas9 that introduce precise base substitutions into a targeted region of the genome. The rate of these substitutions is known to vary between targeted sequences, and the determinants of this variation are not completely understood. To untangle the determinants of base editing efficacy, our group performed a large-scale screen in which we measured base editing outcomes across 20,000 targeted sequences in multiple cell lines and editors. I processed and analysed the data produced in this experiment and found that both the sequence flanking editable bases and the position of those bases in the sequence affect the rate of observed editing. I leveraged this understanding to construct a position-specific model of base editing activity for each editor type and used these models to predict the efficacy and specificity of base editors for correcting pathogenic variants found in ClinVar.

The second results chapter focuses on the mutational outcomes of Cas9-induced cuts in repair-deficient backgrounds. Cas9 creates a double-stranded break at a targeted location in the genome, and the cell repairs this lesion via several pathways which can leave mutations in the repaired sequence. Previous studies have shown that the distribution of these mutations is reproducible and dependent on the sequence being cut, but the effect of repair context on this process is not well understood. I planned an experiment to measure Cas9 repair outcomes at over 5000 target sites in 21 mouse cell lines with knockouts of single repair genes, then processed and analysed the data generated. I show that the knockout cells have reproducibly different repair patterns than controls. I highlight Nbn, Lig4 and PolQ as examples of knockouts with consistent effects on certain mutation types. I examine how the known sequence determinants of Cas9 outcomes affect outcome preference in knockout lines. Lastly, I use this understanding to train models that predict the distribution of Cas9 outcomes in various repair backgrounds.

My final results chapter discusses two shorter collaborative studies on alternative editing technologies. The first is the design of a large-scale screen to profile the behaviour of a new Cas enzyme. I explain the experimental process of profiling Cas outcomes, the decisions involved in designing a guide library and an approach to modelling. The second is the prediction of editing rates when inserting sequences into the genome with prime editors. Here, I train a model to predict editing rates, examine which features are most important to predictive performance, and finally determine that collecting more training data will improve model performance.

Date

2022-12-30

Advisors

Parts, Leopold

Keywords

Base editing, Cas9, CRISPR, DNA repair, Genetics

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Sponsorship

My research was funded in whole, or in part, by the Wellcome Trust Grant [206194]