Generation of a Novel SARS-CoV-2 Sub-genomic RNA Due to the R203K/G204R Variant in Nucleocapsid: Homologous Recombination has Potential to Change SARS-CoV-2 at Both Protein and RNA Level.

Leary, Shay 
Gaudieri, Silvana 
Parker, Matthew D 
Chopra, Abha 
James, Ian 

BACKGROUND: Genetic variations across the SARS-CoV-2 genome may influence transmissibility of the virus and the host's anti-viral immune response, in turn affecting the frequency of variants over time. In this study, we examined the adjacent amino acid polymorphisms in the nucleocapsid (R203K/G204R) of SARS-CoV-2 that arose on the background of the spike D614G change and describe how strains harboring these changes became dominant circulating strains globally. METHODS: Deep-sequencing data of SARS-CoV-2 from public databases and from clinical samples were analyzed to identify and map genetic variants and sub-genomic RNA transcripts across the genome. RESULTS: Sequence analysis suggests that the 3 adjacent nucleotide changes that result in the K203/R204 variant have arisen by homologous recombination from the core sequence of the leader transcription-regulating sequence (TRS) rather than by stepwise mutation. The resulting sequence changes generate a novel sub-genomic RNA transcript for the C-terminal dimerization domain of nucleocapsid. Deep-sequencing data from 981 clinical samples confirmed the presence of the novel TRS-CS-dimerization domain RNA in individuals with the K203/R204 variant. Quantification of sub-genomic RNA indicates that viruses with the K203/R204 variant may also have increased expression of sub-genomic RNA from other open reading frames. CONCLUSIONS: The finding that homologous recombination from the TRS may have occurred since the introduction of SARS-CoV-2 in humans, resulting in both coding changes and novel sub-genomic RNA transcripts, suggests this as a mechanism for diversification and adaptation within its new host.
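
The link between the three adjacent nucleotide changes and a TRS core sequence can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the study's data or pipeline: the 15-nucleotide fragments stand in for nucleocapsid codons 202-206 as commonly reported for the reference and variant genomes, `ACGAAC` is the TRS core sequence commonly reported for SARS-CoV-2, and the helper names `translate`, `find_motif`, and `changed_positions` are hypothetical.

```python
# Illustrative sketch only: sequences and names below are assumptions,
# not data or code from the study.

TRS_CORE = "ACGAAC"  # assumed TRS core sequence (commonly reported for SARS-CoV-2)

# Assumed fragments covering nucleocapsid codons 202-206.
REFERENCE = "AGTAGGGGAACTTCT"  # assumed reference (R203/G204) fragment
VARIANT = "AGTAAACGAACTTCT"    # assumed variant (K203/R204) fragment

# Partial codon table: only the codons that occur in the two fragments.
CODON_TABLE = {
    "AGT": "S", "AGG": "R", "GGA": "G",
    "AAA": "K", "CGA": "R", "ACT": "T", "TCT": "S",
}


def translate(seq: str) -> str:
    """Translate an in-frame nucleotide fragment using the partial table."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def find_motif(seq: str, motif: str) -> list[int]:
    """Return every 0-based start index at which motif occurs in seq."""
    return [i for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif]


def changed_positions(a: str, b: str) -> list[int]:
    """Return the 0-based positions at which two equal-length fragments differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


if __name__ == "__main__":
    print(translate(REFERENCE), translate(VARIANT))   # SRGTS SKRTS
    print(changed_positions(REFERENCE, VARIANT))      # [4, 5, 6]
    print(find_motif(REFERENCE, TRS_CORE))            # []
    print(find_motif(VARIANT, TRS_CORE))              # [5]
```

Under these assumptions, the three adjacent substitutions change two codons (R to K and G to R) and, at the same time, create a copy of the core motif that is absent from the reference fragment, which is the kind of pattern the abstract attributes to recombination from the leader TRS.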

COVID-19, SARS-CoV-2, homologous recombination, sub-genomic RNA transcript, transcription-regulating sequence, viral polymorphism
Journal Title
Pathog Immun
Case Western Reserve University
Medical Research Council (MRC) (MC_PC_19027)
SG, SL and EA were supported by a grant awarded by the National Health and Medical Research Council (NHMRC; APP1148284). SM was supported by a National Institutes of Health (NIH)-funded Tennessee Center for AIDS Research (P30 AI110527). MDP was funded by the NIHR Sheffield Biomedical Research Centre (BRC - IS-BRC-1215-20017). Sequencing of SARS-CoV-2 samples was undertaken by the Sheffield COVID-19 Genomics Group as part of the COG-UK Consortium. COG-UK is supported by funding from the Medical Research Council (MRC), part of UK Research & Innovation (UKRI), the National Institute of Health Research (NIHR) and Genome Research Limited, operating as the Wellcome Sanger Institute. TIdS is supported by a Wellcome Trust Intermediate Clinical Fellowship (110058/Z/15/Z).