Alevin efficiently estimates accurate gene abundances from dscRNA-seq data.

Published version
Peer-reviewed

Type

Article

Authors

Srivastava, Avi 
Malik, Laraib 
Smith, Tom 
Sudbery, Ian 

Abstract

We introduce alevin, a fast end-to-end pipeline to process droplet-based single-cell RNA sequencing data, performing cell barcode detection, read mapping, unique molecular identifier (UMI) deduplication, gene count estimation, and cell barcode whitelisting. Alevin's approach to UMI deduplication considers transcript-level constraints on the molecules from which UMIs may have arisen and accounts for both gene-unique reads and reads that multimap between genes. This addresses the inherent bias in existing tools that discard gene-ambiguous reads, and it improves the accuracy of gene abundance estimates. Alevin is considerably faster than existing gene quantification approaches, typically by a factor of eight, while also using less memory.
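
The bias from discarding gene-ambiguous reads can be illustrated with a minimal toy sketch. It is not alevin's algorithm: the function name `count_umis`, the input layout, and the uniform splitting of ambiguous UMIs across candidate genes are all assumptions made purely for illustration. The abstract states that alevin uses transcript-level constraints, which are not modelled here.

```python
from collections import defaultdict


def count_umis(reads, keep_ambiguous):
    """Toy per-cell gene counting (hypothetical helper, not alevin's method).

    reads: iterable of (cell_barcode, umi, genes), where genes is the set of
    gene names the read maps to.
    """
    # Collapse duplicates: the same (cell barcode, UMI, gene set) seen
    # several times is counted as a single molecule.
    unique = {(cb, umi, frozenset(genes)) for cb, umi, genes in reads}

    counts = defaultdict(float)
    for cb, umi, genes in unique:
        if len(genes) == 1:
            (gene,) = genes
            counts[(cb, gene)] += 1.0
        elif keep_ambiguous:
            # ASSUMPTION: split the molecule evenly across candidate genes.
            # This is a naive stand-in chosen only to keep the sketch short.
            share = 1.0 / len(genes)
            for gene in genes:
                counts[(cb, gene)] += share
        # Otherwise the gene-ambiguous molecule is silently discarded.
    return dict(counts)


# Made-up example data for one cell barcode.
reads = [
    ("AAAC", "UMI1", {"GeneA"}),
    ("AAAC", "UMI1", {"GeneA"}),           # duplicate of the read above
    ("AAAC", "UMI2", {"GeneA", "GeneB"}),  # multimaps between two genes
    ("AAAC", "UMI3", {"GeneB"}),
    ("AAAC", "UMI4", {"GeneA", "GeneB"}),  # multimaps between two genes
]

discarded = count_umis(reads, keep_ambiguous=False)
retained = count_umis(reads, keep_ambiguous=True)
print(sum(discarded.values()), sum(retained.values()))
```

In this made-up input, four distinct molecules are observed, but dropping the two multimapping UMIs leaves only two of them counted, so genes that share sequence with other genes are systematically under-counted.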

Keywords

Cellular barcode, Quantification, Single-cell RNA-seq, UMI deduplication, Animals, DNA Barcoding, Taxonomic, Humans, Mice, Sequence Analysis, RNA, Single-Cell Analysis, Software

Journal Title

Genome Biol

Journal ISSN

1474-7596
1474-760X

Volume Title

20

Publisher

Springer Science and Business Media LLC