Abstract |
---|
Many activities of interest are rare events, with only a few labeled examples available. Therefore models for temporal activity detection which are able to learn from a few examples are desirable. In this paper, we present a conceptually simple and general yet novel framework for few-shot temporal activity detection which detects the start and end time of the few-shot input activities in an untrimmed video. Our model is end-to-end trainable and can benefit from more few-shot examples. At test time, each proposal is assigned the label of the few-shot activity class corresponding to the maximum similarity score. Our Similarity R-C3D method outperforms previous work on three large-scale benchmarks for temporal activity detection (THUMOS14, ActivityNet1.2, and ActivityNet1.3 datasets) in the few-shot setting. Our code will be made available. |
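
The test-time labeling step described in the abstract (each proposal takes the label of the few-shot class with the maximum similarity score) can be illustrated with a minimal sketch. The abstract does not say which similarity function or feature extractor is used, so cosine similarity, the function names `cosine_similarity` and `label_proposals`, and the toy feature vectors are all assumptions made for illustration, not the authors' implementation.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def label_proposals(proposal_features, class_features):
    """Assign each proposal the few-shot class with the highest similarity score.

    proposal_features: list of feature vectors, one per temporal proposal.
    class_features: dict mapping class name -> feature vector built from the
        few-shot examples (hypothetical representation).
    Returns a list of (class_name, score) pairs, one per proposal.
    """
    labels = []
    for feat in proposal_features:
        scores = {
            name: cosine_similarity(feat, ref)
            for name, ref in class_features.items()
        }
        best = max(scores, key=scores.get)
        labels.append((best, scores[best]))
    return labels


# Toy usage with made-up 2-D features.
few_shot_classes = {"jump": [1.0, 0.0], "run": [0.0, 1.0]}
proposals = [[0.9, 0.1], [0.2, 0.8]]
results = label_proposals(proposals, few_shot_classes)
```

In this toy setup the first proposal is closest to the hypothetical "jump" class and the second to "run"; a real system would compare learned features of each proposal against learned features of the few-shot example videos.
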
Year | Venue | DocType |
---|---|---|
2018 | arXiv: Computer Vision and Pattern Recognition | Journal |
Volume | Citations | PageRank |
---|---|---|
abs/1812.10000 | 1 | 0.35 |
References | Authors |
---|---|
20 | 6 |
Name | Order | Citations | PageRank |
---|---|---|---|
Huijuan Xu | 1 | 239 | 12.33 |
Bingyi Kang | 2 | 5 | 3.43 |
Ximeng Sun | 3 | 5 | 2.08 |
Jiashi Feng | 4 | 2165 | 140.81 |
Kate Saenko | 5 | 4478 | 202.48 |
Trevor Darrell | 6 | 22413 | 1800.67 |