Title: Annotating news video with locations
Abstract: The location of a video scene is an important semantic descriptor, especially for broadcast news video. In this paper, we propose a learning-based approach that annotates the shots of news video with locations extracted from the video transcript, using features from multiple video modalities, including the syntactic structure of transcript sentences, speaker identity, and temporal video structure. Machine learning algorithms combine these multi-modal features to solve two sub-problems: (1) whether the location of a video shot is mentioned in the transcript at all, and if so, (2) which of the many locations in the transcript is the correct one (or ones) for this shot. Experiments on the TRECVID dataset show that our approach labels the location of a shot in news video with approximately 85% accuracy.
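The two-stage decision described in the abstract can be sketched as follows. This is a minimal illustration only: the feature names, weights, threshold, and toy data below are hypothetical placeholders, not the paper's trained models (the paper combines multi-modal features with machine-learning classifiers such as SVMs).

```python
# Hypothetical sketch of the two-stage location annotation pipeline from
# the abstract. All feature names, weights, and values are illustrative
# placeholders, not the paper's actual trained models or data.

def stage1_location_in_transcript(shot_features, weights, threshold=0.5):
    """Stage 1: decide whether the shot's location is mentioned in the
    transcript at all (a binary classification in the paper)."""
    score = sum(weights.get(name, 0.0) * value
                for name, value in shot_features.items())
    return score > threshold

def stage2_pick_location(candidate_features, weights):
    """Stage 2: among candidate locations extracted from the transcript,
    score each (shot, location) pair and return the best-scoring one."""
    def score(feats):
        return sum(weights.get(name, 0.0) * v for name, v in feats.items())
    return max(candidate_features, key=lambda loc: score(candidate_features[loc]))

# Toy multi-modal features for one shot (values made up for illustration).
shot = {"anchor_speaking": 1.0, "sentence_has_location_np": 1.0}
candidates = {
    "Baghdad": {"same_sentence": 1.0, "temporal_proximity": 0.9},
    "Washington": {"same_sentence": 0.0, "temporal_proximity": 0.2},
}
w1 = {"anchor_speaking": 0.4, "sentence_has_location_np": 0.4}
w2 = {"same_sentence": 0.6, "temporal_proximity": 0.4}

if stage1_location_in_transcript(shot, w1):
    print(stage2_pick_location(candidates, w2))  # prints "Baghdad"
```

In the paper the two stages are learned from labeled data rather than hand-weighted; the point of the sketch is only the pipeline shape: a gate that asks "is the location in the transcript?" followed by a ranker over transcript-extracted candidates.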
Year: 2006
DOI: 10.1007/11788034_16
Venue: CIVR
Keywords: video scene, transcript sentence, annotating news video, temporal video structure, multiple video, syntactic structure, broadcast news video, news video, video transcript, learning-based approach, video shot, machine learning
Field: Computer vision, Broadcasting, Parse tree, Computer science, TRECVID, Support vector machine, Image processing, Image retrieval, Speech recognition, Video tracking, Artificial intelligence, Smacker video
DocType: Conference
Volume: 4071
ISSN: 0302-9743
ISBN: 3-540-36018-2
Citations: 6
PageRank: 0.51
References: 9
Authors: 2
1. Jun Yang
2. Alexander G. Hauptmann