Title
EEG-Video Emotion-Based Summarization: Learning With EEG Auxiliary Signals
Abstract
Video summarization is the process of selecting a subset of informative keyframes to expedite storytelling with limited loss of information. In this article, we propose an EEG-Video Emotion-based Summarization (EVES) model based on a multimodal deep reinforcement learning (DRL) architecture that leverages neural signals to learn visual interestingness and produce quantitatively and qualitatively better video summaries. As such, EVES learns from multimodal signals rather than from expensive human annotations. Furthermore, to ensure temporal alignment and minimize the modality gap between the visual and EEG modalities, we introduce a Time Synchronization Module (TSM) that uses an attention mechanism to transform the EEG representations onto the visual representation space. We evaluate the performance of EVES on the TVSum and SumMe datasets. Based on rank-order statistics benchmarks, the experimental results show that EVES outperforms unsupervised models and narrows the performance gap with supervised models. Furthermore, the human evaluation scores show that EVES receives an 11.4% higher rating than the state-of-the-art DRL model DR-DSN on content coherence and a 7.4% higher rating on emotion-evoking content. Thus, our work demonstrates the potential of EVES in selecting interesting content that is both coherent and emotion-evoking.
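The abstract describes the TSM as an attention mechanism that re-expresses the EEG stream in the visual representation space so the two modalities are temporally aligned. The record contains no implementation details, so the following is only a minimal, hypothetical sketch of such cross-modal attention; the module name, feature dimensions, and PyTorch framing are assumptions rather than the authors' actual code.

```python
import torch
import torch.nn as nn

class TimeSyncAttention(nn.Module):
    """Hypothetical sketch: visual frame features attend over EEG window
    features so that the EEG signal is re-timed and projected into the
    visual feature space (dimensions are illustrative, not from the paper)."""

    def __init__(self, visual_dim=1024, eeg_dim=310, shared_dim=256):
        super().__init__()
        self.query = nn.Linear(visual_dim, shared_dim)  # visual frames -> queries
        self.key = nn.Linear(eeg_dim, shared_dim)       # EEG windows -> keys
        self.value = nn.Linear(eeg_dim, visual_dim)     # EEG windows -> values in visual space
        self.scale = shared_dim ** -0.5

    def forward(self, visual_feats, eeg_feats):
        # visual_feats: (T_v, visual_dim); eeg_feats: (T_e, eeg_dim)
        q = self.query(visual_feats)                    # (T_v, shared_dim)
        k = self.key(eeg_feats)                         # (T_e, shared_dim)
        v = self.value(eeg_feats)                       # (T_e, visual_dim)
        attn = torch.softmax(q @ k.t() * self.scale, dim=-1)  # (T_v, T_e) soft alignment
        return attn @ v                                 # EEG stream aligned to the video frames


# Usage example: align 120 EEG windows to 60 video frames.
tsm = TimeSyncAttention()
aligned = tsm(torch.randn(60, 1024), torch.randn(120, 310))
print(aligned.shape)  # torch.Size([60, 1024])
```

In EVES, such an aligned EEG representation would presumably serve as the auxiliary signal guiding the DRL summarizer; that reward design is beyond this sketch.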
Year
2022
DOI
10.1109/TAFFC.2022.3208259
Venue
IEEE Transactions on Affective Computing
Keywords
Video summarization, EEG-video representation, emotion-evoking, multimodality
DocType
Journal
Volume
13
Issue
4
ISSN
1949-3045
Citations
0
PageRank
0.34
References
20
Authors
6
Name | Order | Citations | PageRank
Wai-Cheong Lincoln Lew | 1 | 0 | 0.34
Di Wang | 2 | 1337 | 143.48
Kai Keng Ang | 3 | 804 | 64.19
Joo-Hwee Lim | 4 | 783 | 82.45
C. Quek | 5 | 320 | 19.65
Ah-Hwee Tan | 6 | 1385 | 112.07