Distributed graph clustering and sparsification

Accepted version
Peer-reviewed

Type

Article

Authors

Sun, H 
Zanetti, L 

Abstract

Graph clustering is a fundamental computational problem with a number of applications in algorithm design, machine learning, data mining, and analysis of social networks. Over the past decades, researchers have proposed a number of algorithmic design methods for graph clustering. Most of these methods, however, are based on complicated spectral techniques or convex optimisation and cannot be directly applied for clustering many networks that occur in practice, whose information is often collected on different sites. Designing a simple and distributed clustering algorithm is of great interest and has comprehensive applications for processing big datasets.

In this article, we present a simple and distributed algorithm for graph clustering: For a wide class of graphs that are characterised by a strong cluster-structure, our algorithm finishes in a poly-logarithmic number of rounds and recovers a partition of the graph close to optimal. One of the main procedures behind our algorithm is a sampling scheme that, given a dense graph as input, produces a sparse subgraph that provably preserves the cluster-structure of the input. Compared with previous sparsification algorithms that require Laplacian solvers or involve combinatorial constructions, this procedure is easy to implement in a distributed setting and runs fast in practice.

Keywords

46 Information and Computing Sciences, 4602 Artificial Intelligence, Networking and Information Technology R&D (NITRD), Bioengineering

Journal Title

ACM Transactions on Parallel Computing

Journal ISSN

2329-4949
2329-4957

Volume

6

Publisher

Association for Computing Machinery (ACM)

Rights

All rights reserved

Sponsorship

European Research Council (679660)