Journal article
Generative Denoise Distillation: Simple stochastic noises induce efficient knowledge transfer for dense prediction
Abstract
Knowledge distillation is the process of transferring knowledge from a more powerful large model (teacher) to a simpler counterpart (student). Many current approaches have the student imitate the teacher's knowledge directly, which tends to make it learn the features at every spatial location indiscriminately. Real-world datasets frequently exhibit noise, motivating models to acquire compact and representative features instead of memorizing …
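To make the abstract's idea concrete, the following is a minimal, purely illustrative sketch of feature-level distillation with stochastic noise injected into the student's features before matching the teacher. This is not the paper's implementation: the function name, tensor shapes, Gaussian noise, noise scale, and the plain MSE objective are all assumptions.

```python
import numpy as np

def noisy_distillation_loss(student_feat, teacher_feat, noise_std=0.1, rng=None):
    """MSE between noise-perturbed student features and teacher features.

    Illustrative only: injecting stochastic noise discourages the student
    from memorizing the teacher's features point by point.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    noise = rng.normal(0.0, noise_std, size=student_feat.shape)
    noisy_student = student_feat + noise  # stochastic perturbation
    return float(np.mean((noisy_student - teacher_feat) ** 2))

# Toy check: with identical features, the loss reduces to the mean squared
# noise, so it should be close to noise_std**2.
feat = np.ones((2, 8, 4, 4))  # hypothetical (batch, channels, H, W) features
loss = noisy_distillation_loss(feat, feat, noise_std=0.1)
```

In a real training loop this term would be added to the task loss; here it only shows the noise-injection idea in isolation.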
Authors
Liu Z; Xu X; Cao Y; Shen W
Journal
Knowledge-Based Systems, Vol. 302
Publisher
Elsevier
Publication Date
October 2024
DOI
10.1016/j.knosys.2024.112365
ISSN
0950-7051