Alevin efficiently estimates accurate gene abundances from dscRNA-seq data.
Authors
Srivastava, Avi
Malik, Laraib
Smith, Tom
Sudbery, Ian
Patro, Rob https://orcid.org/0000-0001-8463-1675
Abstract
We introduce alevin, a fast end-to-end pipeline to process droplet-based single-cell RNA sequencing data, performing cell barcode detection, read mapping, unique molecular identifier (UMI) deduplication, gene count estimation, and cell barcode whitelisting. Alevin's approach to UMI deduplication considers transcript-level constraints on the molecules from which UMIs may have arisen and accounts for both gene-unique reads and reads that multimap between genes. This addresses the inherent bias in existing tools, which discard gene-ambiguous reads, and improves the accuracy of gene abundance estimates. Alevin is considerably faster than existing gene quantification approaches, typically by a factor of eight, while also using less memory.
Keywords
Cellular barcode; Quantification; Single-cell RNA-seq; UMI deduplication; Animals; DNA Barcoding, Taxonomic; Humans; Mice; Sequence Analysis, RNA; Single-Cell Analysis; Software
Journal Title
Genome Biol
Journal ISSN
1474-7596
1474-760X
Volume Title
20
Publisher
Springer Science and Business Media LLC