Translation-Mediated Stress Responses: Mining of Ribosome Profiling Data
University of Cambridge
Doctor of Philosophy (PhD)
Franaszek, K. (2017). Translation-Mediated Stress Responses: Mining of Ribosome Profiling Data (Doctoral thesis). https://doi.org/10.17863/CAM.15702
Advances in next-generation sequencing platforms during the past decade have resulted in exponential increases in biological data generation. Besides applications in determining the sequences of genomes and other DNA elements, these platforms have allowed the characterization of cell-wide mRNA pools under different conditions and in different tissues. In 2009, Ingolia and colleagues developed an extension of high-throughput sequencing, dubbed ribosome profiling, that provides a snapshot of all cellular mRNA fragments protected by translating ribosomes. This approach allows detection of differential translation activity, annotation of novel protein-coding sequences and variants, identification of ribosome pause sites and estimates of de novo protein synthesis. As with other sequencing-based methodologies, a major challenge of ribosome profiling has been sorting, filtering and interpreting the gigabytes of data produced during the course of a typical experiment. In this thesis, I developed and applied computational pipelines to interrogate ribosome profiling data in relation to gene expression in several viruses and eukaryotic species, as well as to identify sites of ribosomal pausing and sites of non-canonical translation activity. Specifically, I applied various control analyses for characterizing the quality of profiling data and developed scripts for visualizing genome-based (exon-by-exon) rather than transcript-based ribosome footprint alignments. I also examined the challenge of mapping footprints to repetitive sequences in the genome and proposed ways to mitigate the associated problems. I performed differential expression analyses on data from coronavirus-infected murine cells, retrovirus-infected human cells and temperature-stressed Arabidopsis thaliana plants. Dissection of translational responses in Arabidopsis thaliana during heat shock or cold shock revealed several groups of genes that were highly upregulated within 10 minutes of temperature challenge.
Analysis of the branches of the unfolded protein and integrated stress responses during coronavirus infection allowed for deconvolution of transcriptional and translational contributions. During the course of these analyses, I identified errors in a recently published algorithm for detection of differential translation, and wrote corrections that have now been pulled into the repository for this package. Comparison of the translational kinetics of dengue virus infection in mosquito and human cell lines revealed host-specific sites of ribosome pausing and RNA accumulation. Analysis of HIV profiling data revealed footprint peaks that were in agreement with previously proposed models of peptide- or RNA-mediated ribosome stalling. I also developed a simulation to identify transcripts that are prone to generating ribosome-protected fragments (RPFs) with multiple alignments during the read mapping process. Together, the scripts and pipelines developed during the course of this work will serve to expedite future analyses of ribosome profiling data, and the results will inform future studies of several important pathogens and temperature stress in plants.
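The idea behind a multi-mapping simulation of the kind mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `multimap_fraction`, the default footprint length of 28 nt, and the restriction to exact matches within a transcript set (no mismatches, no genomic alignment) are all assumptions made for illustration. The sketch tiles each transcript with every possible fixed-length fragment and reports the fraction of fragments whose sequence also occurs at another site.

```python
from collections import defaultdict


def multimap_fraction(transcripts, read_len=28):
    """Per transcript, return the fraction of simulated footprints whose
    sequence occurs at more than one site across all transcripts.

    transcripts: dict mapping transcript name -> nucleotide sequence.
    read_len: simulated footprint length (28 nt is an assumed default).
    """
    # Count how many sites (any transcript, any offset) carry each fragment.
    site_counts = defaultdict(int)
    for seq in transcripts.values():
        for i in range(len(seq) - read_len + 1):
            site_counts[seq[i:i + read_len]] += 1

    fractions = {}
    for name, seq in transcripts.items():
        n_sites = len(seq) - read_len + 1
        if n_sites <= 0:
            # Transcript is shorter than one footprint; nothing to simulate.
            fractions[name] = 0.0
            continue
        # A fragment is ambiguous if its sequence is found at another site.
        ambiguous = sum(
            1 for i in range(n_sites)
            if site_counts[seq[i:i + read_len]] > 1
        )
        fractions[name] = ambiguous / n_sites
    return fractions
```

Transcripts with a high fraction would be candidates for cautious interpretation in downstream expression estimates. A realistic version would need to allow for mismatches, a range of footprint lengths, and the behavior of the specific aligner used.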
Bioinformatics, Computational Biology, Ribosome Profiling, Virology, MHV, DENV, HIV, Heat shock, Cold shock, RNASeq, RiboSeq, Coronavirus, Flavivirus, Retrovirus, Translation, Ribosome, Translational Recoding
This record's DOI: https://doi.org/10.17863/CAM.15702