dc.contributor.author: Jänes, Jürgen [en]
dc.contributor.author: Hu, Fengyuan [en]
dc.contributor.author: Lewin, Alexandra [en]
dc.contributor.author: Turro Bassols, Ernest [en]
dc.date.accessioned: 2015-04-22T15:45:07Z
dc.date.available: 2015-04-22T15:45:07Z
dc.date.issued: 2015-03-18 [en]
dc.identifier.citation: Briefings in Bioinformatics 2015, 16(6): 932-940. doi:10.1093/bib/bbv007 [en]
dc.identifier.issn: 1467-5463
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/247446
dc.description.abstract: Three principal approaches have been proposed for inferring the set of transcripts expressed in RNA samples using RNA-seq. The simplest approach uses curated annotations, which assumes the transcripts in a sample are a subset of the transcripts listed in a curated database. A more ambitious method involves aligning reads to a reference genome and using the alignments to infer the transcript structures, possibly with the aid of a curated transcript database. The most challenging approach is to assemble reads into putative transcripts de novo without the aid of reference data. We have systematically assessed the properties of these three approaches through a simulation study. We have found that the sensitivity of computational transcript set estimation is severely limited. Computational approaches (both genome-guided and de novo assembly) produce a large number of artefacts, which are assigned large expression estimates and absorb a substantial proportion of the signal when performing expression analysis. The approach using curated annotations shows good expression correlation even when the annotations are incomplete. Furthermore, any incorrect transcripts present in a curated set do not absorb much signal, so it is preferable to have a curation set with high sensitivity than high precision. Software to simulate transcript sets, expression values and sequence reads under a wider range of parameter values and to compare sensitivity, precision and signal-to-noise ratios of different methods is freely available online (https://github.com/boboppie/RSSS) and can be expanded by interested parties to include methods other than the exemplars presented in this article.
dc.description.sponsorship: This work was supported by the Wellcome Trust (WT097679); the Cambridge Biomedical Research Centre; Cancer Research UK (C14303/A10825) and the Medical Research Council (G1002319).
dc.language: English [en]
dc.language.iso: en [en]
dc.publisher: Oxford Journals
dc.rights: Attribution 2.0 UK: England & Wales [*]
dc.rights.uri: http://creativecommons.org/licenses/by/2.0/uk/ [*]
dc.title: A comparative study of RNA-seq analysis strategies [en]
dc.type: Article
dc.description.version: This is the final version of the article. It was first available from Oxford University Press via http://dx.doi.org/10.1093/bib/bbv007 [en]
prism.endingPage: 940
prism.publicationDate: 2015 [en]
prism.publicationName: Briefings in Bioinformatics [en]
prism.startingPage: 932
prism.volume: 16 [en]
dc.rioxxterms.funder: Wellcome Trust
dc.rioxxterms.funder: CRUK
dc.rioxxterms.funder: MRC
dc.rioxxterms.projectid: WT097679
dc.rioxxterms.projectid: C14303/A10825
dc.rioxxterms.projectid: G1002319
rioxxterms.versionofrecord: 10.1093/bib/bbv007 [en]
rioxxterms.licenseref.uri: http://www.rioxx.net/licenses/all-rights-reserved [en]
rioxxterms.licenseref.startdate: 2015-03-18 [en]
dc.contributor.orcid: Turro Bassols, Ernest [0000-0002-1820-6563]
dc.identifier.eissn: 1477-4054
rioxxterms.type: Journal Article/Review [en]
pubs.funder-project-id: Wellcome Trust (097679/Z/11/Z)
pubs.funder-project-id: Cancer Research UK (C14303/A10825)


Except where otherwise noted, this item's licence is described as Attribution 2.0 UK: England & Wales