Deep Learning Based Angular Intra-Prediction for Lossless HEVC Video Coding - Citegraph

Paper Info

Title
Deep Learning Based Angular Intra-Prediction for Lossless HEVC Video Coding

Abstract
This work proposes the first block-wise prediction paradigm based on CNNs for lossless video coding. The proposed prediction scheme improves HEVC performance by replacing a set of 9 angular intra-prediction modes with an improved CNN-based prediction; these are m_2, m_6, m_10, m_14, m_18, m_22, m_26, m_30, and m_34. A causal neighborhood, selected as a 16 x 16 block around the currently predicted block, is used as input. The novel neural network model, called Angular intra-Prediction Convolutional Neural Network (AP-CNN), is designed based on the U-net architecture and operates on three resolutions (16 x 16, 8 x 8, 4 x 4). AP-CNN contains the following U-net structure: 10 convolutional layers (2 + 2, 2 + 2, 2) with (32, 64, 128) filters; 2 deconvolution layers with 32 and 64 filters; and 2 filter concatenation layers. A final convolutional layer with one filter computes a 16 x 16 block, whose bottom-right corner is then cropped to obtain the 4 x 4 output predicted block.
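The layer layout described above can be checked with a small shape-tracing sketch. It is not the authors' implementation: the `Stage` record, the function names, the placement of the deconvolution filter counts (64 at 8 x 8, 32 at 16 x 16), and the factor-2 downsampling between resolutions are assumptions made for illustration, since the abstract does not state how downsampling is done. Only the layer counts, filter counts, resolutions, and the bottom-right 4 x 4 crop come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Stage:
    name: str
    size: int      # spatial resolution (size x size)
    channels: int  # feature maps leaving the stage
    convs: int     # how many of the 10 convolutional layers live here


def trace_ap_cnn(input_size: int = 16) -> List[Stage]:
    """Trace feature-map shapes through a U-net laid out as in the abstract.

    Assumption: size-preserving convolutions and factor-2 resampling
    between the three resolutions (hypothetical details).
    """
    enc16 = Stage("encoder 16x16", input_size, 32, convs=2)
    enc8 = Stage("encoder 8x8", input_size // 2, 64, convs=2)
    mid4 = Stage("bottleneck 4x4", input_size // 4, 128, convs=2)
    up8 = Stage("deconvolution to 8x8", mid4.size * 2, 64, convs=0)
    cat8 = Stage("concatenation at 8x8", up8.size,
                 up8.channels + enc8.channels, convs=0)
    dec8 = Stage("decoder 8x8", cat8.size, 64, convs=2)
    up16 = Stage("deconvolution to 16x16", dec8.size * 2, 32, convs=0)
    cat16 = Stage("concatenation at 16x16", up16.size,
                  up16.channels + enc16.channels, convs=0)
    dec16 = Stage("decoder 16x16", cat16.size, 32, convs=2)
    # The final one-filter layer is separate from the 10 counted above.
    out = Stage("final convolution", dec16.size, 1, convs=0)
    return [enc16, enc8, mid4, up8, cat8, dec8, up16, cat16, dec16, out]


def crop_bottom_right(block, size=4):
    """Keep the bottom-right size x size corner of a 2-D list of rows."""
    return [row[-size:] for row in block[-size:]]


for stage in trace_ap_cnn():
    print(stage.name, stage.size, stage.channels)

# Toy 16 x 16 "network output" whose entries encode their own position.
toy_output = [[r * 16 + c for c in range(16)] for r in range(16)]
predicted_block = crop_bottom_right(toy_output)
```

Under these assumptions the trace ends with a single-channel 16 x 16 block, and `predicted_block` holds rows and columns 12 to 15 of it, which matches the 4 x 4 output block the abstract describes.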
AP-CNN uses a 3 x 3 window with ReLU activation and is trained with the Adam optimizer and an MSE loss function. The experimental assessment is carried out on the Y channel of two datasets: 15 8-bit HEVC test sequences and 7 TUT sequences from the Ultra Video Group (TUT-vSEQ). A model is trained for each of the 9 modes using a corresponding training set generated based on HEVC's optimal mode segmentation applied to 15 HD sequences from Xiph.org and a collection of RGB images from NYC Library. The size of each training set varies between 6700 and 37300 batches, where one batch contains 500 samples (input blocks). Each AP-CNN model was trained for 20 epochs, using a 90%-10% training-validation data split. Table 1 shows compression results for HEVC running under the lossless setting with all intra-prediction, and for AP-CNN, where Lossless HEVC Intra employs the CNN-based predictors. AP-CNN outperforms Lossless HEVC with an average bitrate improvement of around 0.85%. The gain is larger at resolutions of 1080p and above. The improved coding performance is due to AP-CNN's capability to combine linear and nonlinear CNN-based prediction models.
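To give a sense of scale for the training sets, the batch counts quoted above can be converted to sample counts. This is a rough sketch: the helper name `split_samples` is hypothetical, and it assumes the 90%-10% split is applied to individual samples, which the abstract does not specify.

```python
BATCH_SIZE = 500  # samples (input blocks) per batch, as stated in the abstract


def split_samples(num_batches, batch_size=BATCH_SIZE, train_percent=90):
    """Return (total, training, validation) sample counts.

    Assumption: the split is applied per sample rather than per batch.
    """
    total = num_batches * batch_size
    train = total * train_percent // 100
    return total, train, total - train


# Smallest and largest training sets reported in the abstract.
for num_batches in (6700, 37300):
    total, train, val = split_samples(num_batches)
    print(f"{num_batches} batches: {total} samples "
          f"({train} training, {val} validation)")
```

Under this assumption, the per-mode training sets range from 3.35 million to 18.65 million input blocks.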

Year	DOI	Venue
2019	10.1109/DCC.2019.00091	2019 Data Compression Conference (DCC)
Keywords	Field	DocType
Deep Learning,Lossless HEVC,Angular Intra Prediction	Computer vision,Average bitrate,1080p,Convolutional neural network,Computer science,Algorithm,Concatenation,RGB color model,Artificial intelligence,Deep learning,Artificial neural network,Lossless compression	Conference
ISSN	ISBN	Citations
1068-0314	978-1-7281-0658-8	1
PageRank	References	Authors
0.40	0	3

Authors (3 rows)

Name	Order	Citations	PageRank
Hongyue Huang	1	5	1.79
I. Schiopu	2	37	8.04
Adrian Munteanu	3	664	80.29
