Home
Scholarly Works
Finding theme communities from database networks
Journal article

Finding theme communities from database networks

Abstract

Given a database network where each vertex is associated with a transaction database, we are interested in finding theme communities. Here, a theme community is a cohesive subgraph such that a common pattern is frequent in all transaction databases associated with the vertices in the subgraph. Finding all theme communities from a database network enjoys many novel applications. However, it is challenging since even counting the number of all theme communities in a database network is #P-hard. Inspired by the observation that a theme community shrinks when the length of the pattern increases, we investigate several properties of theme communities and develop TCFI, a scalable algorithm that uses these properties to effectively prune the patterns that cannot form any theme community. We also design TC-Tree, a scalable algorithm that decomposes and indexes theme communities efficiently. Retrieving a ranked list of theme communities from a TC-Tree of hundreds of millions of theme communities takes less than 1 second. Extensive experiments and a case study demonstrate the effectiveness and scalability of TCFI and TC-Tree in discovering and querying meaningful theme communities from large database networks.

Authors

Chu L; Wang Z; Pei J; Zhang Y; Yang Y; Chen E

Journal

Proceedings of the VLDB Endowment, Vol. 12, No. 10, pp. 1071–1084

Publisher

Association for Computing Machinery (ACM)

Publication Date

January 1, 2019

DOI

10.14778/3339490.3339492

ISSN

2150-8097

Contact the Experts team