
Mining Diverse and Novel RNA Viruses in Transcriptomic Datasets



Authors

Olendraite, Ingrida (ORCID: https://orcid.org/0000-0002-6209-2233)

Abstract

In this project, I analysed multiple transcriptomic datasets with the goal of identifying novel and diverse RNA viruses. For this, I developed a workflow and applied various methods to accommodate and characterize the (un)expected viral diversity.

I analysed multiple RNA-sequencing datasets of ants and searched for viruses in the following National Center for Biotechnology Information (NCBI) databases: the viruses (non-redundant) nucleotide database (nt/nt GenBank) and the Transcriptome Shotgun Assembly (TSA) database. With this approach, data generated for one goal were analysed for a completely different purpose. The search was performed using multiple methods: de novo assemblies and sequence-sequence (BLAST) or sequence-profile (HMMER) comparisons. For the HMMER-based search, I created and manually curated profile Hidden Markov Models (pHMMs) for the viral RNA-dependent RNA polymerase (RdRp), which was the main protein of interest in this project. The pHMMs represented various viral families as well as genera.

The analysis resulted in the discovery of a novel polycistronic picorna-like RNA virus family, Polycipiviridae. These arthropod-infecting RNA viruses have a unique genome organisation. My colleagues and I hypothesise that these viruses employ novel molecular mechanisms to express their structural (re-initiation) and replication (possibly a novel IRES) proteins.

The pHMM-based search was evaluated, resulting in a selection of various thresholds and insights into current viral diversity and taxonomy. Finally, using this optimized pHMM-based search, I identified over 15,000 viral RdRp-encoding sequences. A downstream analysis of these sequences better explained the taxonomic relationships between various RNA virus groups, improved knowledge of RNA-dependent RNA polymerase diversity, expanded the current understanding of host specificity, and uncovered novel molecular mechanisms of divergent and novel RNA viruses.

Date

2020-12-31

Advisors

Firth, Andrew

Qualification

Doctor of Philosophy (PhD)

Awarding Institution

University of Cambridge

Rights and licensing

Except where otherwise noted, this item's license is described as All Rights Reserved

Sponsorship

European Research Council grant [646891] to Andrew Firth