Repository logo
 

Discovering maximal motif cliques in large heterogeneous information networks

Accepted version
Peer-reviewed

Type

Article

Change log

Authors

Hu, J 
Cheng, R 
Chang, KCC 
Sankar, A 
Fang, Y 

Abstract

We study the discovery of cliques (or "complete" subgraphs) in heterogeneous information networks (HINs). Existing clique-finding solutions often ignore the rich semantics of HINs. We propose motif clique, or m-clique, which redefines subgraphs completeness with respect to a given motif. A motif essentially a small subgraph pattern, is a fundamental building block of an HIN. The m-clique concept is general and allows us to analyse "complete" subgraphs in an HIN with respect to desired high-order connection patterns. We further investigate the maximal m-clique enumeration problem (MMCE), which finds all maximal m-cliques not contained in any other m-cliques. Because MMCE is NP-hard, developing an accurate and efficient solution for MMCE is not straightforward. we thus present the META algorithm, which employs advanced pruning strategies to effectively reduce the search space. We also design fast techniques to avoid generating duplicated maximal m-clique instances. Our extensive experiments on large real and synthetic HINs how that META is highly effective and efficient.

Description

Keywords

46 Information and Computing Sciences, 4904 Pure Mathematics, 49 Mathematical Sciences

Journal Title

Proceedings - International Conference on Data Engineering

Conference Name

Journal ISSN

1084-4627

Volume Title

2019-April

Publisher

IEEE
Sponsorship
Wellcome Trust (083650/Z/07/Z)
Wellcome Trust (100574/Z/12/Z)
Medical Research Council (MC_UU_12012/1)
Medical Research Council (MC_UU_12012/5)
Medical Research Council (MC_PC_12012)