Show simple item record

dc.contributor.author: Benmounah, Z
dc.contributor.author: Meshoul, S
dc.contributor.author: Batouche, M
dc.contributor.author: Lio, Pietro
dc.date.accessioned: 2018-09-13T09:18:26Z
dc.date.available: 2018-09-13T09:18:26Z
dc.date.issued: 2018-08
dc.identifier.issn: 1568-4946
dc.identifier.uri: https://www.repository.cam.ac.uk/handle/1810/280255
dc.description.abstract: Clustering is an important technique for data analysis and knowledge discovery. In the context of big data, it becomes challenging because the huge volume of recently collected data makes conventional clustering algorithms inappropriate. Swarm intelligence algorithms have shown promising results when applied to clustering data sets of moderate size, owing to their decentralized and self-organized behavior. However, these algorithms exhibit limited capabilities when large data sets are involved. In this paper, we developed a decentralized, distributed big data clustering solution using three swarm intelligence algorithms within the MapReduce framework. The developed framework allows cooperation between the three algorithms, namely particle swarm optimization, ant colony optimization and artificial bee colony, to achieve highly scalable data partitioning through a migration strategy. This strategy takes advantage of the combined exploration and exploitation capabilities of these algorithms to foster diversity. The framework is tested using the Amazon Elastic MapReduce (EMR) service, deploying up to 192 computer nodes and 30 gigabytes of data. Parallel metrics such as speed-up, size-up and scale-up are used to measure the elasticity and scalability of the framework. Our results are compared with those of counterpart big data clustering approaches and show a significant improvement in terms of time and convergence to good-quality solutions. The developed model has been applied to clustering epigenetics data according to methylation features in CpG islands, gene bodies, and gene promoters in order to study the impact of epigenetics on aging. Experimental results reveal that DNA methylation changes slightly and not aberrantly with aging, corroborating previous studies.
dc.publisher: Elsevier
dc.subject: big data
dc.subject: swarm intelligence
dc.subject: large-scale clustering
dc.subject: map-reduce
dc.subject: epigenetics
dc.subject: aging
dc.title: Parallel swarm intelligence strategies for large-scale clustering based on MapReduce with application to epigenetics of aging
dc.type: Article
prism.endingPage: 783
prism.publicationName: Applied Soft Computing
prism.startingPage: 771
prism.volume: 69
dc.identifier.doi: 10.17863/CAM.27624
dcterms.dateAccepted: 2018-04-10
rioxxterms.versionofrecord: 10.1016/j.asoc.2018.04.012
rioxxterms.version: AM
rioxxterms.licenseref.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/
rioxxterms.licenseref.startdate: 2018-04-10
dc.contributor.orcid: Lio, Pietro [0000-0002-0540-5053]
dc.identifier.eissn: 1872-9681
dc.publisher.url: https://www.sciencedirect.com/science/article/pii/S1568494618302035?via=ihub#!
rioxxterms.type: Journal Article/Review
cam.issuedOnline: 2018-04-27
dc.identifier.url: https://www.sciencedirect.com/science/article/pii/S1568494618302035?via=ihub#!
rioxxterms.freetoread.startdate: 2019-04-27

