Mining Diverse and Novel RNA Viruses in Transcriptomic Datasets
Abstract
In this project, I analysed multiple transcriptomic datasets with the goal of identifying novel and diverse RNA viruses. To this end, I developed a workflow and applied several methods to accommodate and characterise the (un)expected viral diversity. I analysed multiple RNA-sequencing datasets of ants and searched for viruses in two National Center for Biotechnology Information (NCBI) databases: the viruses (non-redundant) nucleotide database (nt/nt GenBank) and the Transcriptome Shotgun Assembly (TSA) database. With this approach, data generated for one purpose were reanalysed for a completely different one. The search was performed using multiple methods: de novo assembly, sequence-sequence (BLAST) comparisons and sequence-profile (HMMER) comparisons. For the HMMER-based search, I created and manually curated profile Hidden Markov Models (pHMMs) of the viral RNA-dependent RNA polymerase (RdRp), the main protein of interest in this project. The pHMMs represented various viral families as well as genera. The analysis led to the discovery of a novel family of polycistronic picorna-like RNA viruses, Polycipiviridae. These arthropod-infecting RNA viruses have a unique genome organisation. My colleagues and I hypothesise that these viruses employ novel molecular mechanisms to express their structural proteins (re-initiation) and replication proteins (possibly a novel IRES). The pHMM-based search was evaluated, which resulted in the selection of thresholds and provided insights into current viral diversity and taxonomy. Finally, using this optimised pHMM-based search, I identified over 15,000 viral RdRp-encoding sequences. Downstream analysis of these sequences better explained the taxonomic relationships between various RNA virus groups, improved knowledge of RdRp diversity, expanded the current understanding of host specificity, and uncovered novel molecular mechanisms of divergent and novel RNA viruses.