Title
Exploiting Objects with LSTMs for Video Categorization.
Abstract
Temporal dynamics play an important role in video classification. In this paper, we propose to leverage high-level semantic features to open the "black box" of the state-of-the-art temporal model, Long Short-Term Memory (LSTM), with the aim of understanding what it learns. More specifically, we first extract object features from a state-of-the-art CNN model trained to recognize 20K objects. We then feed the extracted features into an LSTM to capture the temporal dynamics in videos. In combination with spatial and motion information, we achieve improvements in supervised video categorization. Furthermore, by masking the inputs, we demonstrate what the LSTM learns, namely (i) which objects are crucial for recognizing a class of interest, and (ii) how the LSTM model can assist the temporal localization of these detected objects.
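The pipeline described in the abstract (per-frame object scores from a CNN fed into an LSTM video classifier, with input masking used to reveal which objects drive a class prediction) could be sketched roughly as below. This is a minimal illustrative sketch assuming a PyTorch setup; the class ObjectLSTMClassifier, the object_importance helper, and all dimensions are assumptions, not the authors' implementation.

# Illustrative PyTorch sketch (not the authors' code): per-frame object scores
# from a CNN are fed into an LSTM video classifier, and an input-masking probe
# measures how much each object contributes to a class prediction.
import torch
import torch.nn as nn

NUM_OBJECTS = 1000   # the paper uses ~20K object categories; smaller here to keep the toy light
NUM_CLASSES = 101    # assumed number of video classes

class ObjectLSTMClassifier(nn.Module):
    def __init__(self, num_objects=NUM_OBJECTS, hidden_size=512, num_classes=NUM_CLASSES):
        super().__init__()
        self.lstm = nn.LSTM(input_size=num_objects, hidden_size=hidden_size, batch_first=True)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, object_feats):
        # object_feats: (batch, num_frames, num_objects) object scores per frame
        _, (h_n, _) = self.lstm(object_feats)
        return self.fc(h_n[-1])          # class logits from the last hidden state

def object_importance(model, object_feats, class_idx, object_idx):
    # Drop in the class score when one object channel is zeroed across all frames.
    model.eval()
    with torch.no_grad():
        base = model(object_feats)[:, class_idx]
        masked = object_feats.clone()
        masked[:, :, object_idx] = 0.0
        return base - model(masked)[:, class_idx]

# Toy usage: 2 videos, 30 frames each.
feats = torch.rand(2, 30, NUM_OBJECTS)
model = ObjectLSTMClassifier()
print(model(feats).shape)                                    # torch.Size([2, 101])
print(object_importance(model, feats, class_idx=0, object_idx=42))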
Year
2016
DOI
10.1145/2964284.2967199
Venue
ACM Multimedia
Field
Black box, Computer vision, Categorization, Masking, Computer science, Long short-term memory, Artificial intelligence, Deep learning, Machine learning
DocType
Conference
Citations
5
PageRank
0.41
References
14
Authors
6
Name                Order  Citations  PageRank
Yongqing Sun        1      6          3.46
Zuxuan Wu           2      496        29.79
Xi Wang             3      553        24.14
Hiroyuki Arai       4      64         13.25
Tetsuya Kinebuchi   5      9          3.17
Yu-Gang Jiang       6      3071       152.58