Abstract | ||
---|---|---|
A key problem in the automatic detection of semantic concepts (such as 'interview' or 'soccer') in video streams is the manual acquisition of adequate training sets. Recently, we proposed using online videos downloaded from portals like youtube.com for this purpose, with the tags that users provide during video upload serving as ground-truth annotations. The problem with such training data is that it is weakly labeled: annotations are provided only at the video level, and many shots of a video may be "non-relevant", i.e., not visually related to a tag. In this paper, we present a probabilistic framework for learning from such weakly annotated training videos in the presence of irrelevant content. The relevance of each keyframe is modeled as a latent random variable that is estimated during training. In quantitative experiments on real-world online videos and TV news data, we demonstrate that the proposed model is significantly more robust to irrelevant content and that the resulting concept detectors generalize better. |
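
The idea of treating keyframe relevance as a latent variable can be illustrated with a small expectation-maximization (EM) sketch. This is not the model from the paper, whose details the abstract does not give. It is a toy version under loudly stated assumptions: each keyframe is reduced to a single scalar feature, relevant keyframes follow a Gaussian whose parameters are learned, and irrelevant keyframes follow a fixed background Gaussian that is assumed to be known in advance. The function name `fit_with_latent_relevance` and all parameters are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_with_latent_relevance(frames, bg_mean, bg_std, n_iter=50):
    """Toy EM: learn a concept model from frames of tagged videos.

    Assumptions (illustrative only): scalar features, a Gaussian concept
    model, and a known, fixed background model for irrelevant frames.
    Returns (relevance prior, concept mean, concept std, responsibilities).
    """
    n = len(frames)
    prior = 0.5  # initial guess for P(frame is relevant)
    mean = sum(frames) / n
    std = max(1e-3, math.sqrt(sum((x - mean) ** 2 for x in frames) / n))
    resp = [0.5] * n

    for _ in range(n_iter):
        # E-step: posterior probability that each frame is relevant.
        resp = []
        for x in frames:
            rel = prior * gaussian_pdf(x, mean, std)
            irr = (1.0 - prior) * gaussian_pdf(x, bg_mean, bg_std)
            total = rel + irr
            resp.append(rel / total if total > 0.0 else 0.5)

        # M-step: re-estimate the concept model, weighting frames by relevance.
        weight = sum(resp)
        if weight < 1e-9:
            break  # no frame looks relevant; keep the previous estimates
        prior = weight / n
        mean = sum(r * x for r, x in zip(resp, frames)) / weight
        var = sum(r * (x - mean) ** 2 for r, x in zip(resp, frames)) / weight
        std = max(1e-3, math.sqrt(var))  # floor avoids a collapsing variance

    return prior, mean, std, resp


# Synthetic data: 60 relevant frames near 5.0, 40 irrelevant frames near 0.0.
random.seed(0)
relevant = [random.gauss(5.0, 0.5) for _ in range(60)]
irrelevant = [random.gauss(0.0, 1.0) for _ in range(40)]
frames = relevant + irrelevant

prior, mean, std, resp = fit_with_latent_relevance(frames, bg_mean=0.0, bg_std=1.0)
print(f"estimated relevance prior={prior:.2f}, concept mean={mean:.2f}, std={std:.2f}")
```

In this sketch the irrelevant frames are down-weighted in the M-step instead of being treated as positive examples, which is the intuition behind the robustness to irrelevant content described in the abstract.
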
Year | DOI | Venue |
---|---|---|
2008 | 10.1145/1386352.1386358 | CIVR |
Keywords | Field | DocType |
training data, relevant frame, online video, weakly annotated training video, tv news data, video stream, real-world online video, video level, irrelevant content, adequate training set, training concept detector, video upload, random variable, ground truth | Training set, Computer vision, Random variable, Information retrieval, Computer science, Upload, Robustness (computer science), Video tracking, Ground truth, Artificial intelligence, Detector, Video quality | Conference |
Citations | PageRank | References |
29 | 2.14 | 14 |
Authors | ||
4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Adrian Ulges | 1 | 328 | 26.61 |
Christian Schulze | 2 | 154 | 16.70 |
Daniel Keysers | 3 | 1737 | 140.59 |
Thomas M. Breuel | 4 | 2362 | 219.10 |