Abstract | ||
---|---|---|
Action recognition is an important and challenging task in computer vision. Existing approaches usually employ a pooling operation to encode isolated patches or trajectories and then aggregate them into a compact video representation. In this paper, we make two contributions towards improving action recognition accuracy and efficiency. First, we study the application of Generalized Max Pooling (GMP), a state-of-the-art pooling technique used in image classification, to action recognition. Second, we propose an approach to improve the efficiency of GMP when it is applied to videos, for which the number of extracted patches is enormous. The key idea is to compute the weighted vector block by block by exploiting sparse encoding vectors and an inverted index. Experiments on the HMDB51 benchmark dataset show that GMP performs significantly better than existing pooling techniques and that our proposed approach improves efficiency. |
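
For readers unfamiliar with GMP, the following is a minimal sketch of the standard ridge-regression formulation of Generalized Max Pooling in its dual form, contrasted with plain sum pooling. It is not the block-by-block, inverted-index computation proposed in the paper. The function names, the regularisation value `lam`, and the toy data are assumptions made for illustration only.

```python
import numpy as np


def sum_pool(phi):
    """Sum pooling: add up the per-patch encodings (columns of phi)."""
    return phi.sum(axis=1)


def generalized_max_pool(phi, lam=1e-3):
    """GMP in its dual ridge-regression form (illustrative sketch).

    phi : (D, N) array, one column per patch encoding.
    lam : regularisation strength (hypothetical default).
    Returns (xi, alpha): the pooled D-vector and the N per-patch weights.
    """
    n_patches = phi.shape[1]
    kernel = phi.T @ phi  # (N, N) patch-to-patch similarities
    # Solve (K + lam * I) alpha = 1 for the per-patch weights.
    alpha = np.linalg.solve(kernel + lam * np.eye(n_patches), np.ones(n_patches))
    # The pooled vector is the weighted sum of the patch encodings.
    return phi @ alpha, alpha


# Toy data: one pattern repeated four times plus one rare pattern.
frequent = np.array([1.0, 0.0, 0.0])
rare = np.array([0.0, 1.0, 0.0])
phi = np.stack([frequent] * 4 + [rare], axis=1)  # shape (3, 5)

xi_sum = sum_pool(phi)
xi_gmp, alpha = generalized_max_pool(phi, lam=1e-6)

print("sum pooling:", xi_sum)
print("GMP:        ", np.round(xi_gmp, 3))
print("similarity of each patch to the GMP vector:", np.round(phi.T @ xi_gmp, 3))
```

In this toy case sum pooling is dominated by the repeated pattern, whereas the GMP vector has roughly the same similarity to every patch, frequent or rare. The dense `N x N` solve is what becomes expensive when a video yields an enormous number of patches, which is the cost the abstract's second contribution targets.
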
Year | DOI | Venue |
---|---|---|
2015 | 10.1109/KSE.2015.45 | 2015 Seventh International Conference on Knowledge and Systems Engineering (KSE) |
Keywords | DocType | Citations |
---|---|---|
action recognition, video pooling, generalized max pooling, Fisher vector, hard assignment, inverted index | Conference | 0 |
PageRank | References | Authors |
---|---|---|
0.34 | 16 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
T. H. L. Nguyen | 1 | 120 | 36.75 |
Sang Phan | 2 | 27 | 7.40 |
Thanh Duc Ngo | 3 | 82 | 22.24 |