A Spatial-Temporal Recurrent Neural Network for Video Saliency Prediction - Citegraph

Paper Info

Title
A Spatial-Temporal Recurrent Neural Network for Video Saliency Prediction

Abstract
In this paper, a recurrent neural network is designed for video saliency prediction considering spatial-temporal features. In our work, video frames are routed through the static network for spatial features and the dynamic network for temporal features. For the spatial-temporal feature integration, a novel select and re-weight fusion model is proposed which can learn and adjust the fusion weights based on the spatial and temporal features in different scenes automatically. Finally, an attention-aware convolutional long short term memory (ConvLSTM) network is developed to predict salient regions based on the features extracted from consecutive frames and generate the ultimate saliency map for each video frame. The proposed method is compared with state-of-the-art saliency models on five public video saliency benchmark datasets. The experimental results demonstrate that our model can achieve advanced performance on video saliency prediction.

Year	2021
DOI	10.1109/TIP.2020.3036749
Venue	IEEE Transactions on Image Processing
Keywords	Video saliency, spatial-temporal features, feature fusion, visual attention, deep learning
DocType	Journal
Volume	30
Issue	1
ISSN	1057-7149
Citations	0
PageRank	0.34
References	18
Authors	3

Authors (3 rows)

Name	Order	Citations	PageRank
Kao Zhang	1	25	2.28
Zhenzhong Chen	2	1244	101.41
Shan Liu	3	96	49.62
