Title
Disentangled Visual Representations for Extreme Human Body Video Compression
Abstract
Recent years have witnessed the great promise of deep neural video compression codecs. However, significant challenges remain when videos must be encoded at extremely low bitrates. Motivated by recent attempts at layered conceptual image compression, we make the first attempt to leverage disentangled visual representations for extreme human body video compression. More specifically, to capture the main structure, we adopt the inferred human pose keypoints as the structure code of each frame, and derive motion information from the structure codes of adjacent frames for further compression. To better exploit texture redundancy, all frames share the same texture codes, and the proposed texture contrastive learning ensures texture consistency within a video. The two branches are consequently transmitted separately, and at the decoder side the generator synthesizes the reconstructed video from the combination of all decoded representations. Both qualitative and quantitative experimental results demonstrate that the proposed scheme can produce perceptually pleasing reconstructions at ultra-low bitrates far below those reachable by other video codecs.
Year
2022
DOI
10.1109/ICME52920.2022.9859831
Venue
2022 IEEE International Conference on Multimedia and Expo (ICME)
Keywords
Video compression, disentangled representations, generative adversarial network, contrastive learning
DocType
Conference
ISSN
1945-7871
ISBN
978-1-6654-8564-7
Citations
0
PageRank
0.34
References
5
Authors
6
Name | Order | Citations, PageRank
Ruofan Wang | 1 | 00.68
Mao Qi | 2 | 222.82
Shiqi Wang | 3 | 1281120.37
Chuanmin Jia | 4 | 26.78
Ronggang Wang | 5 | 13436.57
Siwei Ma | 6 | 2229203.42