SecretSanta: flexible pipelines for functional secretome prediction.

Accepted version
Peer-reviewed

Type

Article

Authors

Gogleva, Anna
Schornack, Sebastian (ORCID: https://orcid.org/0000-0002-7836-5881)

Abstract

MOTIVATION: The secretome is the collection of proteins that a cell secretes into the extracellular space. Secreted proteins maintain and remodel the extracellular matrix and mediate signalling between host and non-host cells. These features make secretomes rich reservoirs of biomarkers for disease classification and host-pathogen interaction studies. Common biomarkers are extracellular proteins secreted via classical pathways, which can be predicted from sequence by annotating the presence or absence of N-terminal signal peptides. Several heterogeneous command-line tools and web interfaces exist to identify individual motifs, signal sequences and domains that are either characteristic of, or strictly excluded from, secreted proteins. However, a single flexible secretome-prediction workflow that combines all analytic steps is still missing.

RESULTS: To bridge this gap, the SecretSanta package implements wrapper and parser functions around established command-line tools for the integrative prediction of extracellular proteins that are secreted via classical pathways. The modularity of SecretSanta enables users to create tailored pipelines and apply them across the whole tree of life, facilitating comparison of secretomes across multiple species or under various conditions.

AVAILABILITY AND IMPLEMENTATION: SecretSanta is implemented in the R programming language and is released under the GPL-3 license. All functions have been optimized and parallelized to allow large-scale processing of sequences. The open-source code, installation instructions and a vignette with use case scenarios can be downloaded from https://github.com/gogleva/SecretSanta.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Keywords

Genomics, Programming Languages, Workflow

Journal Title

Bioinformatics

Journal ISSN

1367-4803
1367-4811

Volume Title

34

Publisher

Oxford University Press (OUP)

Sponsorship

The Royal Society (uf110073)
Gatsby Charitable Foundation (unknown)
European Research Council (637537)