Abstract
---
In this work we propose an audio-visual model for predicting temporal saliency in videos, which we validate and evaluate in an alternative way by employing fMRI data. We aim to bridge the gap between the large improvements achieved in recent years in computational modeling, especially in deep learning, and neurobiological and behavioral research on human vision. The proposed audio-visual model incorporates both state-of-the-art deep architectures for visual saliency, trained on eye-tracking data, and behavioral findings concerning audio-visual integration in multimedia stimuli. A new fMRI database, comprising a variety of videos and subjects, has been collected for evaluation purposes; it may prove useful not only for saliency but for other computer vision problems as well. Evaluating our model on the new fMRI database under a mixed-effects analysis shows that the proposed saliency model correlates strongly with both visual and auditory brain areas, confirming its effectiveness and appropriateness for predicting audio-visual saliency in dynamic stimuli.
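The abstract gives no implementation details, but as a rough illustration of the kind of mixed-effects evaluation it describes, the sketch below regresses a synthetic per-subject BOLD response on a model's frame-level saliency scores using statsmodels' MixedLM, with subjects as random effects. All names, dataset sizes, and the synthetic signal are assumptions for illustration; this is not the authors' data or pipeline.

```python
# Hedged sketch of a mixed-effects saliency-vs-fMRI analysis (illustrative only;
# not the authors' pipeline). Synthetic data stand in for real saliency scores
# and ROI-averaged BOLD responses.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n_subjects, n_frames = 10, 200  # hypothetical dataset sizes

rows = []
for subj in range(n_subjects):
    saliency = rng.random(n_frames)  # stand-in for the model's per-frame saliency
    # Synthetic ROI response: linearly driven by saliency plus noise.
    bold = 0.6 * saliency + rng.normal(0.0, 0.3, n_frames)
    rows.append(pd.DataFrame({"subject": subj, "saliency": saliency, "bold": bold}))
df = pd.concat(rows, ignore_index=True)

# Fixed effect of saliency on BOLD, random intercept per subject.
model = smf.mixedlm("bold ~ saliency", data=df, groups=df["subject"])
result = model.fit()
print(result.summary())  # the saliency coefficient reflects the strength of association
```

A significant positive saliency coefficient would correspond to the kind of model-to-brain correlation the paper reports for visual and auditory areas.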
Year | DOI | Venue |
---|---|---|
2018 | 10.1109/CVPRW.2018.00269 | IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops |
Field | DocType | ISSN
---|---|---|
Data modeling, Computer vision, Pattern recognition, Functional magnetic resonance imaging, Salience (neuroscience), Visualization, Computer science, Correlation, Artificial intelligence, Deep learning, Visual saliency | Conference | 2160-7508
Citations | PageRank | References
---|---|---|
1 | 0.35 | 0
Authors |
---|
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Petros Koutras | 1 | 16 | 6.35 |
Georgia Panagiotaropoulou | 2 | 2 | 1.04 |
Antigoni Tsiami | 3 | 13 | 4.02 |
Petros Maragos | 4 | 3733 | 591.97 |