
Molecular evolution of biological sequences





Vázquez García, Ignacio


Evolution is a ubiquitous feature of living systems. The genetic composition of a population changes in response to the primary evolutionary forces: mutation, selection and genetic drift. Organisms undergoing rapid adaptation acquire multiple mutations that are physically linked in the genome, so their fates are mutually dependent and selection only acts on these loci in their entirety. This aspect has been largely overlooked in the study of asexual or somatic evolution, yet it plays a major role in the evolution of bacterial and viral infections and cancer. In this thesis, we put forward a theoretical description for a minimal model of evolutionary dynamics to identify driver mutations, which carry a large positive fitness effect, among passenger mutations that hitchhike on successful genomes. We examine the effect this mode of selection has on genomic patterns of variation to infer the location of driver mutations and estimate their selection coefficient from time series of mutation frequencies. We then present a probabilistic model to reconstruct genotypically distinct lineages in mixed cell populations from DNA sequencing. This method uses Hidden Markov Models for the deconvolution of genetically diverse populations and can be applied to clonal admixtures of genomes in any asexual population, from evolving pathogens to the somatic evolution of cancer. To understand the effects of selection on rapidly adapting populations, we constructed sequence ensembles in a recombinant library of budding yeast (S. cerevisiae). Using DNA sequencing, we characterised the directed evolution of these populations under selective inhibition of rate-limiting steps of the cell cycle. We observed recurrent patterns of adaptive mutations and characterised common mutational processes, but the spectrum of mutations at the molecular level remained stochastic.
Finally, we investigated the effect of genetic variation on the fate of new mutations, which gives rise to complex evolutionary dynamics. We demonstrate that the fitness variance of the population can set a selective threshold on new mutations, which limits the efficiency of selection. In summary, we combined statistical analyses of genomic sequences, mathematical models of evolutionary dynamics and experiments in molecular evolution to advance our understanding of rapid adaptation. Our results open new avenues for understanding population dynamics that can be translated to a range of biological systems.





Mustonen, Ville


statistical inference, genome sequencing, population genomics, clonal evolution, mutation, selection, microbial evolution, cancer evolution


Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge
The work presented in this thesis has been supported by the Wellcome Trust as part of the PhD programme in Mathematical Genomics and Medicine at the University of Cambridge, by the Sanger Early Career Innovation Award, by Fundación Ibercaja, and partially supported by the National Science Foundation during a research stay at the Kavli Institute for Theoretical Physics (University of California, Santa Barbara).