| Abstract |
| --- |
| Modern deep convolutional neural networks (CNNs) for image classification and object detection are often trained offline on large, static datasets. Some applications, however, will require training in real time on live video streams with a human in the loop. We refer to this class of problem as time-ordered online training (ToOT). These problems require consideration not only of the quantity of incoming training data, but also of the human effort required to annotate and use it. We demonstrate and evaluate a system tailored to training an object detector on a live video stream with minimal input from a human operator. We show that bounding-box annotations can be obtained from weakly supervised single-point clicks through interactive segmentation. Furthermore, by exploiting the time-ordered nature of the video stream through object tracking, we can increase the average training benefit of human interactions by 3-4 times. |
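The abstract's central idea of exploiting the time-ordered video stream can be illustrated with a short sketch: a single click-derived bounding box is propagated forward through subsequent frames by an off-the-shelf tracker, so that one human interaction yields many labelled training examples. The sketch below is a hypothetical illustration rather than the authors' implementation; it assumes `opencv-contrib-python` is available, uses a CSRT tracker as a stand-in (the abstract does not name a specific tracker), and `propagate_annotation` is an invented helper name.

```python
# Hypothetical sketch (not from the paper): propagate one human-provided
# bounding box across subsequent, time-ordered video frames with an
# off-the-shelf tracker, so a single click yields several training examples.
# Assumes opencv-contrib-python; the tracker factory name varies by version
# (e.g. cv2.legacy.TrackerCSRT_create on some OpenCV 4.5+ builds).
import cv2


def propagate_annotation(frames, init_box, max_extra_frames=30):
    """frames: iterable of BGR images in time order; init_box: (x, y, w, h)
    derived from the operator's single-point click via interactive
    segmentation. Returns a list of (frame, box) training examples."""
    frames = iter(frames)
    first = next(frames)
    box = tuple(int(v) for v in init_box)
    tracker = cv2.TrackerCSRT_create()
    tracker.init(first, box)
    examples = [(first, box)]
    for i, frame in enumerate(frames):
        if i >= max_extra_frames:
            break
        ok, tracked = tracker.update(frame)
        if not ok:  # tracker lost the object; stop propagating the label
            break
        examples.append((frame, tuple(int(v) for v in tracked)))
    return examples
```

Each propagated box would then feed the detector's online training step, which is the sense in which one interaction can carry several times the training benefit of a single manual annotation.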
| Year | Venue | Field |
| --- | --- | --- |
| 2018 | arXiv: Computer Vision and Pattern Recognition | Object detection, Annotation, Pattern recognition, Computer science, Segmentation, Convolutional neural network, Video tracking, Artificial intelligence, Contextual image classification, Detector, Minimum bounding box |

| DocType | Volume | Citations |
| --- | --- | --- |
| Journal | abs/1803.10358 | 0 |

| PageRank | References | Authors |
| --- | --- | --- |
| 0.34 | 10 | 3 |
| Name | Order | Citations | PageRank |
| --- | --- | --- | --- |
| Ervin Teng | 1 | 1 | 1.72 |
| Rui Huang | 2 | 2 | 1.04 |
| Bob Iannucci | 3 | 41 | 10.62 |