Home
Scholarly Works
Online Nearest Neighbor Search in Binary Space
Conference

Online Nearest Neighbor Search in Binary Space

Abstract

We revisit the $K$ Nearest Neighbors (KNN) problem in large binary datasets which is of major importance in several applied areas. The goal is to find the $K$ nearest items in a dataset to a query point where both the query and the items lie in the Hamming cube. The problem is addressed in its online setting, that is, data items are inserted sequentially into the dataset. To accommodate efficient similarity search and fast insertion of new items, we propose a data structure that partitions the feature space based on the Hamming weights of the binary codes and their substrings. Empirical evaluations on a large-scale dataset show significant speedup over the linear scan baseline.

Authors

Eghbali S; Ashtiani H; Tahvildari L

Pagination

pp. 853-858

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Publication Date

November 1, 2017

DOI

10.1109/icdm.2017.104

Name of conference

2017 IEEE International Conference on Data Mining (ICDM)
View published work (Non-McMaster Users)

Contact the Experts team