RECOME: a New Density-Based Clustering Algorithm Using Relative KNN Kernel Density
Abstract
Discovering clusters of different shapes, densities, and scales in a dataset
is a well-known challenge in data clustering. In this paper, we
propose the RElative COre MErge (RECOME) clustering algorithm. The core of
RECOME is a novel density measure, i.e., Relative $K$ nearest Neighbor Kernel
Density (RNKD). RECOME identifies core objects with unit RNKD and partitions
non-core objects into atom clusters by successively following higher-density
neighbor relations toward core objects. Core objects and their corresponding
atom clusters are then merged through $\alpha$-reachable paths on a KNN graph.
We discover that the number of clusters computed by RECOME is a step function
of the $\alpha$ parameter, with jump discontinuities at a small set of
values. A fast jump discontinuity discovery (FJDD) method is proposed based on
graph theory. RECOME is evaluated on both synthetic datasets and real datasets.
Experimental results indicate that RECOME is able to discover clusters with
different shapes, densities, and scales. It outperforms six baseline methods on
both synthetic datasets and real datasets. Moreover, FJDD is shown to be
effective at extracting the jump discontinuity set of parameter $\alpha$ for all
tested datasets, which can ease the task of data exploration and parameter
tuning.
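
As a rough illustration of the idea of a relative KNN kernel density, the following Python sketch computes a density for each object from its $K$ nearest neighbors, divides it by the largest density found in that object's neighborhood, and flags objects whose relative density equals one as core objects. The abstract does not state the RNKD formula, so the exponential kernel, the bandwidth choice (mean distance to the $K$-th neighbor), and the function name relative_knn_density are all assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def relative_knn_density(X, k):
    """Illustrative relative KNN kernel density (assumed form, not the paper's).

    Returns (relative, core): relative[i] lies in (0, 1], and core[i] is True
    when object i has the highest density within its own KNN neighborhood.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances (O(n^2) memory; fine for a small demo).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Exclude each object from its own neighbor list, then take the k nearest.
    masked = dist.copy()
    np.fill_diagonal(masked, np.inf)
    knn = np.argsort(masked, axis=1)[:, :k]
    knn_dist = np.take_along_axis(dist, knn, axis=1)

    # Assumed bandwidth: mean distance to the k-th nearest neighbor.
    sigma = knn_dist[:, -1].mean()

    # Assumed kernel: sum of exp(-d / sigma) over the k nearest neighbors.
    density = np.exp(-knn_dist / sigma).sum(axis=1)

    # Normalize by the largest density in the neighborhood (self included).
    neighborhood_max = np.maximum(density, density[knn].max(axis=1))
    relative = density / neighborhood_max

    # Core objects have unit relative density.
    core = relative == 1.0
    return relative, core

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Two well-separated blobs with different spreads (synthetic demo data).
    X = np.vstack([rng.normal(0.0, 0.3, (30, 2)),
                   rng.normal(5.0, 1.0, (30, 2))])
    relative, core = relative_knn_density(X, k=5)
    print("number of core objects:", int(core.sum()))
```

Because each density is compared only with densities in its own neighborhood, a sparse cluster can still contain objects with unit relative density, which is one way to read the abstract's claim about handling clusters of different densities.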