Duplication is a prominent mechanism of recent gene birth in Caenorhabditis elegans


Type

Thesis

Authors

Abstract

The large number of reference genomes available for different species, and their comparison, has enabled the elucidation of gene birth mechanisms that act over long evolutionary timescales. However, the lack of multiple reference-quality genomes for different individuals of the same species has hampered the study of the mechanisms behind evolutionarily younger gene births. Despite the high throughput of second-generation sequencing technologies, their short read length has limited the study of genetic diversity to single nucleotide polymorphisms (SNPs) and short indels. To study gene-level events, we need to characterise the genetic diversity of a species comprehensively, including structural variants (SVs) (>50 bp). I present the most comprehensive set of genomes and SVs for Caenorhabditis elegans. I have assembled a high-quality genome for each of 20 wild isolates of the nematode using long- and short-read sequencing. I show that 1,587 transcripts are deleted among the wild isolates and thus sketch the first definition of the core genome of C. elegans. I present the case of a highly proliferative transposon harbouring a transcription factor binding site (TFBS) and use it to address the question of transposon co-option in this model organism. Finally, using this dataset, I show that tandem gene duplication is a prominent gene birth mechanism, whereas horizontal gene transfer (HGT) played little or no role in the birth of recent C. elegans genes. Additionally, I show that G protein-coupled receptors (GPCRs) have high levels of presence/absence variation (PAV) and discuss the significance of this finding in light of the ecology of this little worm.

Description

Date

2021-06-01

Advisors

Hemberg, Martin
Miska, Eric Alexander

Keywords

genomics, biology, sequencing, DNA, evolution, gene birth, PacBio, Pacific Biosciences, long reads, genomes, genome assembly, bioinformatics

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Sponsorship

Wellcome