Abstract
---

Many previous methods have demonstrated the importance of considering semantically relevant objects for video-based human activity recognition, yet none of these methods has harnessed the power of large text corpora to relate objects and activities and to transfer that knowledge into learning a unified deep convolutional neural network. We present a novel activity recognition CNN that co-learns an object recognition task in an end-to-end multitask learning scheme, improving upon the baseline activity recognition performance. We further improve the multitask approach by exploiting a text-guided semantic space to select the objects most relevant to the target activities. To the best of our knowledge, we are the first to investigate this approach.
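
The abstract describes two mechanisms: a text-guided semantic space (the keywords mention word2vec) used to select object classes relevant to the target activities, and a CNN that co-learns activity and object recognition via a multitask loss. The sketch below is a minimal illustration of both ideas under stated assumptions, not the authors' released code: objects are ranked by cosine similarity to activity labels in a word2vec-style embedding space, and a shared backbone feeds two classification heads trained jointly. All names (`select_relevant_objects`, `MultitaskActivityCNN`, the weight `lam`) are illustrative, and a frame-level ResNet-18 stands in for whatever video architecture the paper actually uses.

```python
import numpy as np
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18


def select_relevant_objects(activity_words, object_words, embed, top_k=5):
    """Rank object classes by mean cosine similarity to the activity words
    in a word2vec-style embedding space (`embed`: word -> np.ndarray)."""
    def unit(w):
        v = np.asarray(embed[w], dtype=np.float64)
        return v / np.linalg.norm(v)

    act = np.stack([unit(w) for w in activity_words])            # (A, d)
    scores = {o: float((act @ unit(o)).mean()) for o in object_words}
    return sorted(object_words, key=scores.get, reverse=True)[:top_k]


class MultitaskActivityCNN(nn.Module):
    """Shared backbone with two heads: activity classification plus an
    auxiliary object-recognition head, trained jointly end to end."""
    def __init__(self, num_activities, num_objects):
        super().__init__()
        backbone = resnet18(weights=None)    # frame-level stand-in backbone
        feat_dim = backbone.fc.in_features
        backbone.fc = nn.Identity()          # expose pooled features
        self.backbone = backbone
        self.activity_head = nn.Linear(feat_dim, num_activities)
        self.object_head = nn.Linear(feat_dim, num_objects)

    def forward(self, x):                    # x: (batch, 3, H, W) frames
        feats = self.backbone(x)
        return self.activity_head(feats), self.object_head(feats)


def multitask_loss(act_logits, obj_logits, act_y, obj_y, lam=0.5):
    """Joint objective: activity loss plus a weighted auxiliary object loss.
    The weight `lam` is an assumed hyperparameter, not a value from the paper."""
    return F.cross_entropy(act_logits, act_y) + lam * F.cross_entropy(obj_logits, obj_y)


# Toy usage of the selection step; random vectors only make the call runnable.
# In practice one would load pretrained word2vec vectors (e.g., gensim KeyedVectors).
rng = np.random.default_rng(0)
embed = {w: rng.normal(size=300) for w in
         ["cooking", "cycling", "knife", "pan", "bicycle", "helmet"]}
print(select_relevant_objects(["cooking"], ["knife", "pan", "bicycle", "helmet"],
                              embed, top_k=2))
```
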
Year | DOI | Venue
---|---|---
2018 | 10.1109/icassp.2019.8682698 | 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Keywords | Field | DocType
---|---|---
text-guided, CNN, activity recognition, object recognition, word2vec | Activity recognition, Multi-task learning, Convolutional neural network, Computer science, Text corpus, Artificial intelligence, Semantics, Machine learning, Semantic space, Cognitive neuroscience of visual object recognition | Journal

Volume | ISSN | Citations
---|---|---
abs/1805.01818 | 1520-6149 | 0

PageRank | References | Authors
---|---|---
0.34 | 4 | 5

Name | Order | Citations | PageRank |
---|---|---|---
Sungmin Eum | 1 | 69 | 7.40 |
Christopher Reale | 2 | 17 | 2.29 |
Heesung Kwon | 3 | 400 | 37.09 |
Claire Bonial | 4 | 232 | 18.02 |
Clare R. Voss | 5 | 344 | 29.51 |