Abstract |
---|
We present a system for concurrent activity recognition. To extract features associated with different activities, we propose a feature-to-activity attention that maps the extracted global features to sub-features associated with individual activities. To model the temporal associations of individual activities, we propose a transformer-network encoder that models independent temporal associations for each activity. To make the concurrent activity prediction aware of the potential associations between activities, we propose self-attention with an association mask. Our system achieves state-of-the-art or comparable performance on three commonly used concurrent activity detection datasets. Our visualizations demonstrate that the system locates the spatial-temporal features that matter for the final decision. We also show that the system can be applied to general multilabel classification problems. |

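
The abstract names self-attention with an association mask but gives no formulation. As a rough illustration only, the sketch below shows generic masked scaled dot-product self-attention over one feature vector per activity. The function name `masked_self_attention`, the binary mask layout, and the use of the raw features as queries, keys, and values are all assumptions made for this example, not details taken from the paper.

```python
import math


def masked_self_attention(x, mask):
    """Single-head self-attention over per-activity feature vectors.

    x    : list of n vectors (one per activity), each of length d
    mask : n x n list of 0/1; mask[i][j] == 1 lets activity i attend to j
           (assumed: every activity may attend to itself, so no row is empty)

    Queries, keys and values are x itself (no learned projections), which
    keeps the sketch short; a trained model would learn those projections.
    """
    n, d = len(x), len(x[0])
    out = []
    for i in range(n):
        # Scaled dot-product scores; masked-out pairs get -inf
        scores = [
            sum(a * b for a, b in zip(x[i], x[j])) / math.sqrt(d)
            if mask[i][j] else float("-inf")
            for j in range(n)
        ]
        # Numerically stable softmax; exp(-inf) evaluates to 0.0
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Output is the attention-weighted sum of the value vectors
        out.append([sum(weights[j] * x[j][k] for j in range(n)) for k in range(d)])
    return out


# Hypothetical example: activity 0 is unrelated to the others,
# while activities 1 and 2 are marked as associated.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
mask = [[1, 0, 0],
        [0, 1, 1],
        [0, 1, 1]]
out = masked_self_attention(x, mask)
```

In this toy setup, activity 0 attends only to itself and keeps its own features unchanged, while activities 1 and 2 mix information only with each other.
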
Year | Venue | DocType |
---|---|---|
2018 | arXiv: Computer Vision and Pattern Recognition | Journal |

Volume | Citations | PageRank |
---|---|---|
abs/1812.02817 | 0 | 0.34 |

References | Authors |
---|---|
0 | 6 |

Name | Order | Citations | PageRank |
---|---|---|---|
Yanyi Zhang | 1 | 29 | 6.40 |
Xinyu Li | 2 | 88 | 37.72 |
Huang Kaixiang | 3 | 7 | 1.83 |
Yehan Wang | 4 | 0 | 0.34 |
Shuhong Chen | 5 | 6 | 2.84 |
Ivan Marsic | 6 | 716 | 91.96 |