Title
DETECTING CATEGORIES IN NEWS VIDEO USING ACOUSTIC, SPEECH, AND IMAGE FEATURES
Abstract
This work describes systems for detecting semantic categories present in news video. The multimedia data was processed in three ways: the audio signal was converted to a sequence of acoustic features, automatic speech recognition provided a word-level transcription, and image features were computed for selected frames of the video signal. Primary acoustic, speech, and vision systems were trained to discriminate instances of the categories. Higher-level systems exploited correlations among the categories, incorporated sequential context, and combined the joint evidence from the three information sources. We present experimental results from the TREC video retrieval evaluation.
Year: 2006
Venue: TRECVID
DocType: Conference
Citations: 6
PageRank: 1.28
References: 5
Authors
7
Name
Order
Citations
PageRank
Slav Petrov12405107.56
Arlo Faria2667.87
Pascal Michaillat361.61
Alexander C. Berg410554630.24
Andreas Stolcke56690712.46
Dan Klein68083495.21
Jitendra Malik7394453782.10