Quantitative genetics of gene expression during fruit fly development
Over the last ten years, genome-wide association studies (GWAS) have been used to identify genetic variants associated with many diseases as well as quantitative phenotypes, by exploiting naturally occurring genetic variation in large cohorts of individuals. More recently, the GWAS approach has also been applied to high-throughput RNA sequencing (RNA-seq) data in order to find loci associated with different levels of gene expression, called expression quantitative trait loci (eQTLs). Because of the large amount of data required for such high-resolution eQTL studies, most of them have so far been carried out in humans, where the cost of data collection could be justified by a possible future impact on human health. However, due to the rapidly falling price of high-throughput sequencing, it is now also becoming feasible to perform high-resolution eQTL studies in higher model organisms. This enables the study of gene regulation in biological contexts that have so far been beyond our reach for practical or ethical reasons, such as early embryonic development. Taking advantage of these new possibilities, we performed a high-resolution eQTL study on 80 inbred fruit fly lines from the Drosophila Genetic Reference Panel, which represent naturally occurring genetic variation in a wild population of Drosophila melanogaster. Using a 3′ Tag RNA-sequencing protocol, we were able to estimate the expression level of both genes and different 3′ isoforms of the same gene. We estimated these expression levels for each line at three different stages of embryonic development, allowing us not only to improve our understanding of D. melanogaster gene regulation in general, but also to investigate how gene regulation changes during development. In this thesis, I describe the processing of 3′ Tag-Seq data into both 3′ isoform expression levels and overall gene expression levels.
Using these expression levels, I call proximal eQTLs, both those common to all developmental stages and those specific to a single stage, with a multivariate linear mixed model approach that accounts for various confounding factors. I then investigate the properties of these eQTLs, such as their location and the gene categories enriched for or depleted of eQTLs. Finally, I extend the proximal eQTL calling approach to distal variants to find gene regulatory mechanisms acting in trans. Taken together, this thesis describes the design, challenges and results of performing a multivariate eQTL study in a higher model organism and provides new insights into gene regulation in D. melanogaster during embryonic development.