dc.contributor.author: Tumescheit, Charlotte
dc.contributor.author: Firth, Andrew
dc.contributor.author: Brown, Katherine
dc.date.accessioned: 2022-04-21T01:03:28Z
dc.date.available: 2022-04-21T01:03:28Z
dc.date.issued: 2022
dc.identifier.issn: 2167-8359
dc.identifier.other: 35310163
dc.identifier.other: PMC8932311
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/336297
dc.description.abstract: Background: Throughout biology, multiple sequence alignments (MSAs) form the basis of much investigation into biological features and relationships. These alignments are at the heart of many bioinformatics analyses. However, sequences in MSAs are often incomplete or very divergent, which can lead to poor alignment and large gaps. This slows down computation and can impact conclusions without being biologically relevant. Cleaning the alignment by removing common issues such as gaps, divergent sequences, large insertions and deletions and poorly aligned sequence ends can substantially improve analyses. Manual editing of MSAs is very widespread but is time-consuming and difficult to reproduce. Results: We present a comprehensive, user-friendly MSA trimming tool with multiple visualisation options. Our highly customisable command line tool aims to give intervention power to the user by offering various options, and outputs graphical representations of the alignment before and after processing to give the user a clear overview of what has been removed. The main functionalities of the tool include removing regions of low coverage due to insertions, removing gaps, cropping poorly aligned sequence ends and removing sequences that are too divergent or too short. The thresholds for each function can be specified by the user and parameters can be adjusted to each individual MSA. CIAlign is designed with an emphasis on solving specific and common alignment problems and on providing transparency to the user. Conclusion: CIAlign effectively removes problematic regions and sequences from MSAs and provides novel visualisation options. This tool can be used to fine-tune alignments for further analysis and processing. The tool is aimed at anyone who wishes to automatically clean up parts of an MSA and those requiring a new, accessible way of visualising large MSAs.
dc.language: eng
dc.publisher: PeerJ
dc.rights: Attribution 4.0 International
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.source: essn: 2167-8359
dc.source: nlmid: 101603425
dc.subject: Comparative genomics
dc.subject: Multiple sequence alignment
dc.subject: phylogenetics
dc.subject: Transcriptomics
dc.subject: Alignment Quality
dc.subject: Python Tool
dc.title: CIAlign: A highly customisable command line tool to clean, interpret and visualise multiple sequence alignments.
dc.type: Article
dc.date.updated: 2022-04-21T01:03:28Z
prism.publicationName: PeerJ
prism.volume: 10
dc.identifier.doi: 10.17863/CAM.83715
dcterms.dateAccepted: 2022-02-01
rioxxterms.versionofrecord: 10.7717/peerj.12983
rioxxterms.version: VoR
rioxxterms.licenseref.uri: https://creativecommons.org/licenses/by/4.0/
dc.contributor.orcid: Firth, Andrew [0000-0002-7986-9520]
dc.contributor.orcid: Brown, Katherine [0000-0002-8400-6922]
dc.identifier.eissn: 2167-8359
pubs.funder-project-id: Wellcome Trust (106207/Z/14/Z)
pubs.funder-project-id: European Research Council (646891)
cam.issuedOnline: 2022-03-15
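
Illustrative note: the abstract lists removing gaps and removing sequences that are too short among the tool's main functions. The sketch below shows, on a toy alignment, what two such cleaning steps might look like in plain Python. It is not CIAlign's code or interface. The function names, the `min_length` parameter, and the toy data are hypothetical, and it assumes the alignment is held as a dictionary of equal-length strings with `-` as the gap character.

```python
from typing import Dict

GAP = "-"  # assumed gap character


def remove_short_sequences(alignment: Dict[str, str], min_length: int) -> Dict[str, str]:
    """Drop sequences with fewer than min_length non-gap residues (hypothetical helper)."""
    return {
        name: seq
        for name, seq in alignment.items()
        if len(seq) - seq.count(GAP) >= min_length
    }


def remove_gap_only_columns(alignment: Dict[str, str]) -> Dict[str, str]:
    """Drop every column in which all sequences contain a gap (hypothetical helper)."""
    if not alignment:
        return {}
    length = len(next(iter(alignment.values())))
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")
    # Keep a column if at least one sequence has a residue there.
    keep = [
        i for i in range(length)
        if any(seq[i] != GAP for seq in alignment.values())
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Toy alignment, invented for illustration only.
toy = {
    "seq1": "ATG--CGT",
    "seq2": "ATG--CGA",
    "seq3": "-T----G-",
}

# Remove the very short sequence first, then the columns left with only gaps.
cleaned = remove_gap_only_columns(remove_short_sequences(toy, min_length=4))
for name, seq in cleaned.items():
    print(name, seq)
```

The order of the two steps matters in this sketch: dropping the short sequence first leaves two columns that contain only gaps, which the second step then removes.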


Except where otherwise noted, this item's licence is described as Attribution 4.0 International