| Abstract |
|---|
| We present a system that automatically tags videos, i.e., detects high-level semantic concepts such as objects or actions in them. Our system does not rely on datasets manually annotated for research purposes. Instead, we propose to use videos from online portals like youtube.com as a novel source of training data, with the tags provided by users during upload serving as ground-truth annotations. This allows our system to learn autonomously by automatically downloading its training set. The key contribution of this work is a set of large-scale quantitative experiments on real-world online videos, in which we investigate the influence of the individual system components and how well our tagger generalizes to novel content. Our key results are: (1) fair tagging results can be obtained by a late fusion of several kinds of visual features; (2) using more than one keyframe per shot is helpful; (3) to generalize to different video content (e.g., another video portal), the system can be adapted by expanding its training set. |
| Year | DOI | Venue |
|---|---|---|
| 2008 | 10.1007/978-3-540-79547-6_40 | ICVS |
| Keywords | Field | DocType |
|---|---|---|
| novel source, different video content, training data, training set, novel content, real-world online video, online portal, key result, key contribution, individual system component, ground truth | Training set, Computer vision, Computer science, Upload, Ground truth, Artificial intelligence | Conference |
| Volume | ISSN | ISBN |
|---|---|---|
| 5008 | 0302-9743 | 3-540-79546-4 |
| Citations | PageRank | References |
|---|---|---|
| 14 | 0.86 | 15 |
| Authors |
|---|
| 4 |
| Name | Order | Citations | PageRank |
|---|---|---|---|
| Adrian Ulges | 1 | 328 | 26.61 |
| Christian Schulze | 2 | 154 | 16.70 |
| Daniel Keysers | 3 | 1737 | 140.59 |
| Thomas M. Breuel | 4 | 2362 | 219.10 |