| Abstract |
| --- |
| In this paper, we propose a novel model for recognizing human interactions in videos via discriminative patches. Each frame is represented as a set of mid-level discriminative patches, which are extracted automatically by association rule mining on convolutional neural network (CNN) activations. We further refine these patches based on the observation that discriminative patches usually occur in the climax period of an interaction. The climax of an interaction is defined in this paper as the contiguous frames that contain the most firing patches. The patches are then purified by a reward-punishment rule, which ensures that the discriminative patches occur frequently in the climax period (key frames) and seldom in non-key frames. Finally, the label of an interaction video clip is determined by the votes of the patches detected in it. Experimental results on UT-Interaction Set #1, Set #2, and the BIT-Interaction Dataset show that the proposed discriminative patches achieve encouraging performance. |
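The final voting step described in the abstract could be sketched as follows. This is a minimal illustration only: the function name, the input representation (one predicted class label per detected patch), and the tie-breaking behavior are assumptions, not details from the paper.

```python
from collections import Counter

def classify_clip(patch_labels):
    """Majority vote over the class labels of the patches detected in a clip.

    patch_labels: list of class labels, one per detected discriminative patch
    (a hypothetical representation; the paper does not specify the exact form).
    Returns the most frequent label, or None if no patches fired.
    """
    if not patch_labels:
        return None
    # Counter.most_common(1) returns [(label, count)] for the top label;
    # ties are broken by first occurrence in the input.
    return Counter(patch_labels).most_common(1)[0][0]

# Hypothetical usage: labels voted by patches across a clip's frames.
print(classify_clip(["hug", "hug", "push", "hug"]))
```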
| Year | DOI | Venue |
| --- | --- | --- |
| 2016 | 10.1007/978-3-319-54184-6_22 | COMPUTER VISION - ACCV 2016, PT II |
| Field | DocType | Volume |
| --- | --- | --- |
| Pattern recognition, Computer science, Convolutional neural network, Human interaction, Speech recognition, Association rule learning, Artificial intelligence, Discriminative model | Conference | 10112 |
| ISSN | Citations | PageRank |
| --- | --- | --- |
| 0302-9743 | 0 | 0.34 |

| References | Authors |
| --- | --- |
| 0 | 3 |
| Name | Order | Citations | PageRank |
| --- | --- | --- | --- |
| Dingyi Shan | 1 | 0 | 0.68 |
| Laiyun Qing | 2 | 337 | 24.66 |
| Jun Miao | 3 | 220 | 22.17 |