Abstract | ||
---|---|---|
We propose VideoSSL, a semi-supervised learning approach for video classification using convolutional neural networks (CNNs). As in other computer vision tasks, existing supervised video classification methods demand a large amount of labeled data to attain good performance. However, annotating a large dataset is expensive and time-consuming. To minimize the dependence on a large annotated dataset, our proposed semi-supervised method trains from a small number of labeled examples and exploits two regulatory signals from unlabeled data. The first signal is the pseudo-labels of unlabeled examples, computed from the confidences of the CNN being trained. The second is the normalized probabilities predicted by an image classifier CNN, which capture information about the appearance of the objects of interest in the video. We show that, under the supervision of these guiding signals from unlabeled examples, a video classification CNN can achieve impressive performance using only a small fraction of annotated examples on three publicly available datasets: UCF101, HMDB51, and Kinetics. |
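
The two signals from unlabeled data described in the abstract can be illustrated with a minimal per-example sketch. This is not the authors' implementation: the function names, the confidence threshold of 0.95, the temperature parameter, and the use of plain cross-entropy for both terms are assumptions made only for illustration.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into normalized probabilities."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(target_probs, pred_probs, eps=1e-12):
    """Cross-entropy between a target distribution and a prediction."""
    return -sum(t * math.log(p + eps) for t, p in zip(target_probs, pred_probs))


def pseudo_label_loss(video_logits, threshold=0.95):
    """Signal 1 (sketch): treat the video CNN's own confident prediction
    on an unlabeled clip as its label. The threshold is an assumed value."""
    probs = softmax(video_logits)
    confidence = max(probs)
    if confidence < threshold:
        return 0.0  # not confident enough: the clip contributes no loss
    label = probs.index(confidence)
    one_hot = [1.0 if i == label else 0.0 for i in range(len(probs))]
    return cross_entropy(one_hot, probs)


def appearance_loss(student_logits, image_classifier_logits, temperature=1.0):
    """Signal 2 (sketch): match the normalized probabilities predicted by
    an image classifier CNN, here with a soft-target cross-entropy."""
    student = softmax(student_logits, temperature)
    teacher = softmax(image_classifier_logits, temperature)
    return cross_entropy(teacher, student)


if __name__ == "__main__":
    # Hypothetical scores for one unlabeled clip.
    print(pseudo_label_loss([10.0, 0.0, 0.0]))
    print(appearance_loss([1.0, 0.5, 0.0], [2.0, 0.0, 0.0]))
```

In a training loop, terms like these would be added, with weights chosen by the practitioner, to the usual supervised cross-entropy on the small labeled subset.
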
Year | DOI | Venue |
---|---|---|
2021 | 10.1109/WACV48630.2021.00115 | 2021 IEEE Winter Conference on Applications of Computer Vision (WACV 2021) |
DocType | ISSN | Citations |
Conference | 2472-6737 | 0 |
PageRank | References | Authors |
0.34 | 0 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Longlong Jing | 1 | 65 | 6.73 |
Toufiq Parag | 2 | 52 | 7.18 |
Zhe Wu | 3 | 55 | 5.93 |
Yingli Tian | 4 | 4062 | 249.81 |
Hongcheng Wang | 5 | 0 | 0.34 |