Abstract | ||
---|---|---|
A novel deep convolutional neural network is proposed to predict gaze on the current frame in egocentric videos. Inspired by the human visual system, we introduce a fovea module responsible for sharp central vision and name our model the Foveated Neural Network (FNN). The retina-like visual inputs from the region of interest on the previous frame are analysed and encoded. The fusion of the hidden representations of the previous frame with the feature maps of the current frame guides gaze prediction on the current frame. To capture motion, we also include the dense optical flow between these adjacent frames as an additional input. Experimental results show that FNN outperforms state-of-the-art algorithms on a publicly available egocentric dataset. The analysis of FNN demonstrates that both the hidden representations of the foveated visual input from the previous frame and the motion information between adjacent frames are effective in improving gaze prediction performance in egocentric videos. |
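
As a rough illustration of the foveated input described in the abstract, the sketch below crops a fixed-size window around a gaze point on the previous frame. It is not the authors' implementation: the function name `foveal_crop`, the square window, and the border-clamping behaviour are all assumptions made for this example, and the paper's retina-like encoding may differ.

```python
def foveal_crop(frame, gaze_xy, size):
    """Return a size x size patch of `frame` centered on the gaze point.

    Hypothetical helper: `frame` is a list of rows (H x W) and
    `gaze_xy` is an (x, y) pixel position on that frame.
    """
    height, width = len(frame), len(frame[0])
    if size > height or size > width:
        raise ValueError("crop size exceeds frame dimensions")
    x, y = gaze_xy
    half = size // 2
    # Clamp the top-left corner so the window stays inside the frame
    # (an assumption; padding would be another reasonable choice).
    left = min(max(x - half, 0), width - size)
    top = min(max(y - half, 0), height - size)
    return [row[left:left + size] for row in frame[top:top + size]]


# Toy previous frame: 5 rows x 6 columns with distinct pixel values.
prev_frame = [[r * 6 + c for c in range(6)] for r in range(5)]
patch = foveal_crop(prev_frame, gaze_xy=(5, 0), size=3)
```

In a full pipeline, a patch like this would be encoded and fused with features of the current frame; that part depends on network details the abstract does not give.
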
Year | Venue | Keywords |
---|---|---|
2017 | 2017 24th IEEE International Conference on Image Processing (ICIP) | Visual Attention, Saliency, Egocentric Videos, Gaze, Fovea |
Field | DocType | ISSN |
Computer vision, Pattern recognition, Gaze, Visualization, Convolutional neural network, Computer science, Salience (neuroscience), Human visual system model, Feature extraction, Artificial intelligence, Artificial neural network, Optical flow | Conference | 1522-4880 |
Citations | PageRank | References |
0 | 0.34 | 4 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Mengmi Zhang | 1 | 5 | 2.76 |
Keng Teck | 2 | 59 | 4.18 |
Joo-Hwee Lim | 3 | 783 | 82.45 |
Qi Zhao | 4 | 683 | 44.60 |