Low Latency Video Denoising for Online Conferencing Using CNN Architectures
Abstract
In this paper, we propose a pipeline for real-time video denoising with low
runtime cost and high perceptual quality. The vast majority of denoising
studies focus on image denoising. The minority of works that address video
denoising achieve higher quality and temporal coherence, but at a higher
computational cost. The approach we introduce leverages the advantages of both
image- and video-denoising
architectures. Our pipeline first denoises the keyframes, which make up
one-fifth of the frames, using the HI-GAN blind image denoising architecture.
Then, the remaining four-fifths of the noisy frames and the denoised keyframe
data are fed into the FastDVDnet video denoising model. The final output is
rendered on the user's display in real time. The combination of these
low-latency neural network
architectures produces real-time denoising with high perceptual quality, with
applications in video conferencing and other real-time media streaming systems.
A custom noise detector analyzer provides real-time feedback to adapt the
weights and improve the models' output.
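
As an illustration only, the frame scheduling described above might be sketched
as follows. This is a minimal sketch under stated assumptions, not the
implementation used in this work: the names denoise_stream, is_keyframe,
image_denoiser and video_denoiser are hypothetical, the two denoisers are
passed in as plain callables standing in for HI-GAN and FastDVDnet, and the
keyframe is assumed to be the first frame of each group of five. How the
denoised keyframe data is actually combined with neighbouring noisy frames
inside the video model is not specified here, so the sketch simply hands the
most recent denoised keyframe to the video denoiser.

```python
from typing import Callable, Iterable, Iterator, Optional, TypeVar

Frame = TypeVar("Frame")

# One keyframe per five frames, matching the one-fifth ratio in the abstract.
KEYFRAME_INTERVAL = 5


def is_keyframe(index: int, interval: int = KEYFRAME_INTERVAL) -> bool:
    """Assumption: the first frame of every group of `interval` is the keyframe."""
    return index % interval == 0


def denoise_stream(
    frames: Iterable[Frame],
    image_denoiser: Callable[[Frame], Frame],
    video_denoiser: Callable[[Frame, Optional[Frame]], Frame],
    interval: int = KEYFRAME_INTERVAL,
) -> Iterator[Frame]:
    """Yield denoised frames in display order.

    image_denoiser stands in for the blind image model (keyframes only);
    video_denoiser stands in for the video model and receives the noisy
    frame plus the most recent denoised keyframe.
    """
    last_keyframe: Optional[Frame] = None
    for index, frame in enumerate(frames):
        if is_keyframe(index, interval):
            # Keyframe path: run the image denoiser and remember the result.
            last_keyframe = image_denoiser(frame)
            yield last_keyframe
        else:
            # Non-keyframe path: run the video denoiser, guided by the keyframe.
            yield video_denoiser(frame, last_keyframe)
```

Because the function is a generator, each frame is emitted as soon as it has
been processed, which is the property a real-time display loop would rely on.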