Research data supporting "CLUSTERnGO: a user-defined modelling platform for two-stage clustering of time-series data"
Fidaner, Işık Barış
Cemgil, Ali Taylan
Oliver, Stephen G.
University of Cambridge
Fidaner, I. B., Cankorur-Cetinkaya, A., Dikicioglu, D., Kirdar, B., Cemgil, A. T., & Oliver, S. G. (2015). Research data supporting "CLUSTERnGO: a user-defined modelling platform for two-stage clustering of time-series data" [Dataset]. https://www.repository.cam.ac.uk/handle/1810/252482
Simple bioinformatic tools are frequently used to analyse time-series datasets regardless of their ability to deal with transient phenomena, limiting the meaningful information that may be extracted from them. This situation requires the development and exploitation of tailor-made, easy-to-use and flexible tools designed specifically for the analysis of time-series datasets. We present a novel statistical application called CLUSTERnGO, which uses a model-based clustering algorithm that fulfils this need. This algorithm involves two components of operation. Component 1 constructs a Bayesian non-parametric model (Infinite Mixture of Piecewise Linear Sequences) and Component 2 applies a novel clustering methodology (Two-Stage Clustering). The software can also assign biological meaning to the identified clusters using an appropriate ontology, and it applies multiple hypothesis testing to report the significance of these enrichments. The algorithm has a four-phase pipeline. The application can be executed using either command-line tools or a user-friendly Graphical User Interface. The C++ and Qt source code, the GUI applications for Windows, OS X and Linux operating systems, and the user manual are freely available for download under the GNU GPL v3 license in the compressed folder.
This record supports the publication available at http://bioinformatics.oxfordjournals.org/content/early/2015/09/30/bioinformatics.btv532.long
The compressed folder (.zip extension) includes the publication and the user manual for the software in .pdf format. The source code (src) and the executable files (app) for Windows-, Linux- and MacOS-based operating systems are included in separate compressed folders with the .zip extension.
time-series data analysis, model-based clustering, Infinite Mixture of Piecewise Linear Sequences, Bayesian non-parametric models
Publication Reference: https://doi.org/10.1093/bioinformatics/btv532
This work was supported by the Biotechnology and Biological Sciences Research Council [BRIC2.2 grant BB/K011138/1 to S.G.O.] and the EU 7th Framework Programme [BIOLEDGE Contract No: 289126 to S.G.O.].
This record's URL: https://www.repository.cam.ac.uk/handle/1810/252482
This record is licensed under a GNU GPLv3 licence.