Title
MVVA-Net: a Video Aesthetic Quality Assessment Network with Cognitive Fusion of Multi-type Feature–Based Strong Generalization
Abstract
The growing popularity of short videos on social media platforms makes evaluating their aesthetic quality a significant challenge. In this paper, we first construct a large-scale, properly annotated short video aesthetics (SVA) dataset of 6900 video shots. We then propose a cognitive multi-type feature fusion network (MVVA-Net) for video aesthetic quality assessment. MVVA-Net consists of two branches, an intra-frame aesthetics branch and an inter-frame aesthetics branch, which take different types of video frames as input. The inter-frame branch extracts inter-frame aesthetic features from sequential frames sampled at fixed intervals, and the intra-frame branch extracts intra-frame aesthetic features from key frames selected by the inter-frame difference method. The two kinds of features are fused adaptively to assess video aesthetic quality. MVVA-Net does not require a fixed number of input frames, which greatly enhances its generalization ability. We performed quantitative comparisons and ablation studies. The experimental results show that the two branches effectively extract intra-frame and inter-frame aesthetic features from different videos and that, with adaptive fusion of these features, MVVA-Net achieves better classification performance and stronger generalization ability than other methods, performing well on different datasets.
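The abstract names two frame-selection strategies without giving details: sampling at fixed intervals for the inter-frame branch, and key-frame selection by inter-frame difference for the intra-frame branch. The following is a minimal sketch of how such sampling could look, not the authors' implementation. The function names, the use of mean absolute pixel difference as the difference measure, and the top-k selection rule are all assumptions made for illustration.

```python
import numpy as np


def sample_fixed_interval(frames, interval):
    """Return the indices of every `interval`-th frame, starting at frame 0.

    Hypothetical helper: the abstract only says frames are taken at
    fixed intervals, not which interval or offset is used.
    """
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return list(range(0, len(frames), interval))


def select_key_frames(frames, num_key_frames):
    """Return indices of the frames that differ most from their predecessor.

    Assumption: the inter-frame difference is the mean absolute pixel
    difference between consecutive frames, and the top-k frames are kept.
    """
    frames = np.asarray(frames, dtype=np.float32)
    if len(frames) < 2:
        return list(range(len(frames)))
    # diffs[i] scores frame i + 1 against frame i.
    diffs = np.abs(frames[1:] - frames[:-1])
    diffs = diffs.reshape(len(frames) - 1, -1).mean(axis=1)
    k = min(num_key_frames, len(diffs))
    # Take the k largest differences and shift by one to get frame indices.
    top = np.argsort(diffs)[::-1][:k] + 1
    return sorted(int(i) for i in top)
```

Neither function fixes the length of the video, so both accept clips with any number of frames, in the spirit of the non-fixed input strategy the abstract describes. How the selected frames are fed to the two branches and fused is not specified in the abstract and is left out here.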
Year: 2022
DOI: 10.1007/s12559-021-09947-1
Venue: Cognitive Computation
DocType: Journal
Volume: 14
Issue: 4
ISSN: 1866-9956
Citations: 0
PageRank: 0.34
References: 5
Authors: 4
Name | Order | Citations PageRank
Min Li | 1 | 4920.98
Zheng Wang | 2 | 434.79
Jinchang Ren | 3 | 114488.54
Meijun Sun | 4 | 7411.77