
An Ensemble Model for Combating Label Noise

Abstract

Labels crawled from web services (e.g., images queried from search engines, or tags collected from social-media images) are often noisy, and such label noise degrades the classification performance of the resulting deep neural network (DNN) models. In this paper, we propose an ensemble model consisting of two networks that prevents the model from memorizing noisy labels. Within our model, one network generates an anchoring label from its prediction on a weakly-augmented image; its peer network, which takes the strongly-augmented version of the same image as input, is forced to produce a prediction close to the anchoring label for knowledge distillation. By observing the loss distribution, we use a mixture model to dynamically estimate the clean probability of each training sample and generate a confident clean set. We then train both networks simultaneously on this clean set to minimize a loss function that combines an unsupervised matching loss (measuring the consistency of the two networks) and a supervised classification loss (measuring classification performance). We theoretically analyze the gradient of our loss function and show that it implicitly prevents memorization of wrong labels. Experiments on two simulated benchmarks and one real-world dataset demonstrate that our approach achieves substantial improvements over state-of-the-art methods.
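The objective described in the abstract can be sketched as follows. This is a minimal numpy illustration, not the authors' implementation: the function name, hyperparameters (`lam`, `tau`), and the simple thresholding of the clean probabilities are all assumptions made for the example.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def ensemble_loss(logits_weak, logits_strong, labels, clean_prob, lam=1.0, tau=0.5):
    """Sketch of the combined objective from the abstract (illustrative only).

    logits_weak  : anchor network's logits on weakly-augmented images
    logits_strong: peer network's logits on strongly-augmented images
    labels       : (possibly noisy) integer class labels
    clean_prob   : per-sample clean probability (e.g., from a mixture model
                   fitted to the per-sample loss distribution)
    lam, tau     : assumed hyperparameters (matching weight, clean threshold)
    """
    p_anchor = softmax(logits_weak)    # anchoring label (treated as a fixed target)
    p_peer = softmax(logits_strong)
    # Unsupervised matching loss: cross-entropy of the peer's prediction
    # against the anchoring label, measuring consistency of the two networks.
    match = -(p_anchor * np.log(p_peer + 1e-12)).sum(axis=1).mean()
    # Supervised classification loss, computed only on the confident clean set.
    clean = clean_prob > tau
    idx = np.arange(len(labels))
    ce = -np.log(p_peer[idx, labels] + 1e-12)
    sup = ce[clean].mean() if clean.any() else 0.0
    return sup + lam * match
```

In the paper the clean probabilities come from a mixture model fit to the loss distribution and both networks are trained simultaneously; here a fixed `clean_prob` vector stands in for that step.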

Authors

Lu Y; Bo Y; He W

Pagination

pp. 608-617

Publisher

Association for Computing Machinery (ACM)

Publication Date

February 11, 2022

DOI

10.1145/3488560.3498376

Name of conference

Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining
