Title
Deep Learning Based Angular Intra-Prediction for Lossless HEVC Video Coding
Abstract
This work proposes the first block-wise prediction paradigm based on CNNs for lossless video coding. The proposed prediction scheme improves HEVC performance by replacing a set of 9 angular intra-prediction modes with an improved CNN-based prediction; these are m2, m6, m10, m14, m18, m22, m26, m30, and m34. The input is a 16 × 16 causal neighborhood around the block currently being predicted. The novel neural network model, called Angular intra-Prediction Convolutional Neural Network (AP-CNN), is based on the U-net architecture and operates at three resolutions (16 × 16, 8 × 8, 4 × 4). AP-CNN has the following U-net structure: 10 convolutional layers (2 + 2, 2 + 2, 2) with (32, 64, 128) filters; 2 deconvolution layers with 32 and 64 filters; and 2 filter concatenation layers. A final convolutional layer with one filter computes a 16 × 16 block, which is then cropped at the bottom-right corner to obtain the 4 × 4 output predicted block.
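The layer counts above can be checked with a small shape trace. The following is a minimal sketch, not the authors' implementation: it assumes a single-channel (Y) input, "same" padding so convolutions keep the spatial size, 2 × 2 pooling for downsampling, stride-2 deconvolutions, and a conventional U-net ordering of layers. None of these details are stated in the abstract, and all function names are hypothetical.

```python
# Shape trace of a U-net layout that matches the counts given in the abstract.
# Shapes are (height, width, channels). Assumed, not stated in the abstract:
# "same" padding, 2x2 pooling, stride-2 deconvolution, layer ordering.

def conv(shape, filters):
    """Convolution with 'same' padding: spatial size kept, channels replaced."""
    h, w, _ = shape
    return (h, w, filters)

def pool(shape):
    """Assumed 2x2 downsampling between resolutions."""
    h, w, c = shape
    return (h // 2, w // 2, c)

def deconv(shape, filters):
    """Assumed stride-2 deconvolution: spatial size doubled."""
    h, w, _ = shape
    return (2 * h, 2 * w, filters)

def concat(a, b):
    """Filter concatenation along the channel axis."""
    assert a[:2] == b[:2], "spatial sizes must match"
    return (a[0], a[1], a[2] + b[2])

def trace_ap_cnn(input_shape=(16, 16, 1)):
    counts = {"conv": 0, "deconv": 0, "concat": 0}

    def conv_pair(x, filters):
        for _ in range(2):
            x = conv(x, filters)
            counts["conv"] += 1
        return x

    def up_and_merge(x, skip, filters):
        x = deconv(x, filters)
        counts["deconv"] += 1
        x = concat(x, skip)
        counts["concat"] += 1
        return x

    enc16 = conv_pair(input_shape, 32)                    # 16 x 16, 32 filters
    enc8 = conv_pair(pool(enc16), 64)                     # 8 x 8, 64 filters
    mid4 = conv_pair(pool(enc8), 128)                     # 4 x 4, 128 filters
    dec8 = conv_pair(up_and_merge(mid4, enc8, 64), 64)    # back to 8 x 8
    dec16 = conv_pair(up_and_merge(dec8, enc16, 32), 32)  # back to 16 x 16
    out = conv(dec16, 1)  # final one-filter layer, not among the 10
    return out, counts

def crop_bottom_right(block, size=4):
    """Keep the bottom-right size x size corner of a 2-D list."""
    return [row[-size:] for row in block[-size:]]

if __name__ == "__main__":
    out_shape, layer_counts = trace_ap_cnn()
    print(out_shape, layer_counts)
```

Under these assumptions the trace yields 10 convolutional layers, 2 deconvolution layers, and 2 concatenations, followed by a one-filter layer that outputs a 16 × 16 block; `crop_bottom_right` then selects the 4 × 4 predicted block.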
AP-CNN uses a 3 × 3 window and ReLU activations, and it is trained with the Adam optimizer and an MSE loss function. The experimental assessment is carried out on the Y channel of two datasets: 15 8-bit HEVC test sequences and 7 TUT sequences from the Ultra Video Group (TUT-vSEQ). A model is trained for each of the 9 modes using a corresponding training set generated from HEVC's optimal mode segmentation applied to 15 HD sequences from Xiph.org and a collection of RGB images from NYC Library. The size of each training set varies between 6700 and 37300 batches, where one batch contains 500 samples (input blocks). Each AP-CNN model was trained for 20 epochs using a 90%-10% training-validation split. Table 1 shows compression results for HEVC running under the lossless setting with all intra-prediction, and for AP-CNN, where lossless HEVC intra coding employs the CNN-based predictors. AP-CNN outperforms lossless HEVC with an average bitrate improvement of around 0.85%. Performance improves further at resolutions of 1080p and above. The improved coding performance is due to AP-CNN's capability to combine linear and nonlinear CNN-based prediction models.
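To give a sense of scale, the batch counts quoted above can be converted into sample counts. This is a minimal sketch that assumes the 90%-10% split is applied over individual samples; the abstract does not say whether the split is per sample or per batch, and the function name is hypothetical.

```python
SAMPLES_PER_BATCH = 500  # stated in the abstract

def split_sizes(num_batches):
    """Return (total, training, validation) sample counts for a 90%-10% split."""
    total = num_batches * SAMPLES_PER_BATCH
    train = total * 9 // 10  # assumption: split taken over samples
    return total, train, total - train

if __name__ == "__main__":
    for batches in (6700, 37300):  # smallest and largest training sets
        print(batches, split_sizes(batches))
```

Under this assumption the smallest training set holds 3,350,000 input blocks and the largest 18,650,000.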
Year
2019
DOI
10.1109/DCC.2019.00091
Venue
2019 Data Compression Conference (DCC)
Keywords
Deep Learning, Lossless HEVC, Angular Intra Prediction
Field
Computer vision, Average bitrate, 1080p, Convolutional neural network, Computer science, Algorithm, Concatenation, RGB color model, Artificial intelligence, Deep learning, Artificial neural network, Lossless compression
DocType
Conference
ISSN
1068-0314
ISBN
978-1-7281-0658-8
Citations
1
PageRank
0.40
References
0
Authors
3
Name, Order, Citations, PageRank
Hongyue Huang, 1, 5, 1.79
I. Schiopu, 2, 37, 8.04
Adrian Munteanu, 3, 664, 80.29