Crumble: reference free lossy compression of sequence quality values
Published version
Peer-reviewed
Abstract
Motivation: The bulk of space taken up by NGS sequencing CRAM files consists of per-base quality values. Most of these are unnecessary for variant calling, offering an opportunity for space saving.
Results: On the Syndip test set, a 17-fold reduction in the quality storage portion of a CRAM file can be achieved while maintaining variant calling accuracy. The size reduction of an entire CRAM file varied from 2.2- to 7.4-fold, depending on the non-quality content of the original file (see Supplementary Material S6 for details).
Availability and implementation: Crumble is open source and can be obtained from https://github.com/jkbonfield/crumble.
Supplementary information: Supplementary data are available at Bioinformatics online.
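As a rough illustration of why the whole-file reduction is smaller than the quality-only reduction, the sketch below computes the overall fold reduction when only the quality portion shrinks. It assumes the non-quality data keep their original size, which is a simplification and not a claim from the paper. The function name `overall_fold_reduction` and the example quality fractions are hypothetical, chosen only for illustration.

```python
def overall_fold_reduction(quality_fraction, quality_fold=17.0):
    """Whole-file fold reduction when only the quality portion shrinks.

    quality_fraction: share of the original file taken by quality values (0..1).
    quality_fold: fold reduction applied to that share (17 is the Syndip figure).
    Assumption: all non-quality data keep their original size.
    """
    if not 0.0 <= quality_fraction <= 1.0:
        raise ValueError("quality_fraction must be between 0 and 1")
    if quality_fold <= 0:
        raise ValueError("quality_fold must be positive")
    # Size after compression, relative to an original size of 1.0.
    remaining = (1.0 - quality_fraction) + quality_fraction / quality_fold
    return 1.0 / remaining


# Illustrative quality fractions only; these are not measurements from the paper.
for fraction in (0.5, 0.75, 0.9):
    print(f"quality share {fraction:.2f}: {overall_fold_reduction(fraction):.2f}-fold overall")
```

Under this simplified model, the larger the share of the file held by quality values, the closer the whole-file reduction gets to the 17-fold quality-only figure.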
Description
Journal Title
Bioinformatics
Journal ISSN
1367-4803
1367-4811
Volume Title
35
Publisher
Oxford University Press
Publisher DOI
Rights and licensing
Except where otherwise noted, this item's license is described as Attribution 4.0 International
Sponsorship
Wellcome Trust (unknown)
This work was funded by the Wellcome Trust [WT098051].

