Journal article

Learning-Based Video Compression Framework With Implicit Spatial Transform for Applications in the Internet of Things

Abstract

The rapid development of Big Data and network technology demands more secure and efficient video transmission for surveillance and video-analysis applications. Classical video transmission relies on spatial-frequency transforms for lossy compression, but its coding efficiency is limited. Deep learning-based approaches overcome this limitation. In this work, we push the limit further by proposing an implicit spatial transform parameter method, which models interframe redundancy to efficiently provide information for frame compression. Specifically, our method comprises a transform estimation module, which estimates the conversion from the decoded frame to the current frame, and a context generator; transform compensation and the context generator together produce a condensed high-dimensional context. Furthermore, we propose a P-frame codec that compresses frames more efficiently by removing interframe redundancy. The proposed framework is extensible through its flexible context module. We demonstrate experimentally that our method outperforms previous methods by a large margin: it saves 34.817% more bit rate than H.265/HEVC, and achieves 17.500% more bit-rate savings and a 0.490 dB gain in peak signal-to-noise ratio (PSNR) over the current state-of-the-art learning-based method of Liu et al. (2022).
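The abstract describes a data flow but no implementation, so the PyTorch sketch below is only a minimal illustration of that flow under stated assumptions: a transform estimation module predicts an implicit spatial transform from the decoded reference frame and the current frame, transform compensation aligns the reference accordingly, a context generator condenses the result into a high-dimensional context, and a conditional P-frame codec compresses the current frame given that context. All module names, the dense-offset parameterization of the transform, the layer shapes, and the straight-through quantizer are illustrative assumptions, not the authors' architecture; entropy coding and training losses are omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransformEstimator(nn.Module):
    """Assumed form of the transform estimation module: predicts a dense
    2-channel (dx, dy) offset field from the decoded and current frames."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(6, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 2, 3, padding=1),  # per-pixel offsets
        )

    def forward(self, ref, cur):
        return self.net(torch.cat([ref, cur], dim=1))

def warp(ref, offset):
    """Transform compensation: bilinearly samples the reference frame at
    locations displaced by the estimated offsets."""
    n, _, h, w = ref.shape
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=ref.device),
        torch.linspace(-1, 1, w, device=ref.device),
        indexing="ij",
    )
    base = torch.stack([xs, ys], dim=-1).expand(n, -1, -1, -1)
    # Normalize pixel offsets to grid_sample's [-1, 1] coordinate range.
    norm = torch.stack([offset[:, 0] / (w / 2), offset[:, 1] / (h / 2)], dim=-1)
    return F.grid_sample(ref, base + norm, align_corners=True,
                         padding_mode="border")

class ContextGenerator(nn.Module):
    """Lifts the motion-compensated frame into a condensed
    high-dimensional context for conditional coding."""
    def __init__(self, ch=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1),
        )

    def forward(self, compensated):
        return self.net(compensated)

class PFrameCodec(nn.Module):
    """Conditional P-frame autoencoder: encodes the current frame given the
    context, so only the information left after removing inter-frame
    redundancy needs to be represented in the latent."""
    def __init__(self, ch=64, latent=96):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(3 + ch, ch, 5, stride=2, padding=2), nn.ReLU(inplace=True),
            nn.Conv2d(ch, latent, 5, stride=2, padding=2),
        )
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(latent, ch, 5, stride=2, padding=2, output_padding=1),
            nn.ReLU(inplace=True),
            nn.ConvTranspose2d(ch, 3, 5, stride=2, padding=2, output_padding=1),
        )

    def forward(self, cur, context):
        y = self.enc(torch.cat([cur, context], dim=1))
        y_hat = y + (torch.round(y) - y).detach()  # straight-through quantization
        return self.dec(y_hat), y_hat

if __name__ == "__main__":
    ref = torch.rand(1, 3, 64, 64)   # previously decoded frame
    cur = torch.rand(1, 3, 64, 64)   # current frame to compress
    offset = TransformEstimator()(ref, cur)
    context = ContextGenerator()(warp(ref, offset))
    recon, latent = PFrameCodec()(cur, context)
    print(recon.shape)  # torch.Size([1, 3, 64, 64])
```

Note the design point this sketch is meant to expose: the codec never sees the raw reference frame, only the condensed context, so swapping in a different context generator leaves the codec unchanged, which is one plausible reading of the abstract's claim that the framework is extensible through a flexible context module.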

Authors

Li Q; Zhu S; Wang J; Chen T

Journal

IEEE Transactions on Industrial Informatics, Vol. 19, No. 5, pp. 6576–6587

Publisher

Institute of Electrical and Electronics Engineers (IEEE)

Publication Date

May 1, 2023

DOI

10.1109/tii.2022.3204681

ISSN

1551-3203
