Efficient iterative Hi-C scaffolder based on N-best neighbors.
Publication Date
2021-11-27
Journal Title
BMC Bioinformatics
ISSN
1471-2105
Publisher
Springer Science and Business Media LLC
Volume
22
Issue
1
Language
en
Type
Article
This Version
VoR
Citation
Guan, D., McCarthy, S., Ning, Z., Wang, G., Wang, Y., & Durbin, R. (2021). Efficient iterative Hi-C scaffolder based on N-best neighbors. BMC Bioinformatics, 22 (1). https://doi.org/10.1186/s12859-021-04453-5
Abstract
BACKGROUND: Efficient and effective genome scaffolding tools are still in high demand for generating reference-quality assemblies. While long-read data alone is unlikely to yield a chromosome-scale assembly for most eukaryotic species, inexpensive Hi-C sequencing, which captures the chromosomal profile of a genome, is now widely used to complete the task. However, existing Hi-C based scaffolding tools either require the chromosome number as a priori input or lack the ability to build highly continuous scaffolds.
RESULTS: We design and develop a novel Hi-C based scaffolding tool, pin_hic, which uses contact information from Hi-C reads to construct a scaffolding graph iteratively based on the N-best neighbors of contigs. After scaffolding, it identifies potential misjoins and breaks them to preserve scaffolding accuracy. In tests on three long-read based de novo assemblies from three different species, we demonstrate that pin_hic is more efficient than current state-of-the-art tools and generates much more continuous scaffolds, while achieving higher or comparable accuracy.
CONCLUSIONS: Pin_hic is an efficient Hi-C based scaffolding tool that can be useful for building chromosome-scale assemblies. As many sequencing projects have been launched in recent years, we believe pin_hic has the potential to be applied in these projects and to make a meaningful contribution.
Keywords
Software, Hi-C, Scaffolding
Sponsorship
National Natural Science Foundation of China (2017YFC0907503, 2018YFC0910504, 2017YFC1201201)
Wellcome Trust (WT207492)
Identifiers
s12859-021-04453-5, 4453
External DOI: https://doi.org/10.1186/s12859-021-04453-5
This record's URL: https://www.repository.cam.ac.uk/handle/1810/331446
Rights
Licence:
http://creativecommons.org/licenses/by/4.0/