Abstract |
---|
This paper presents a video summarization technique for rushes that employs high-level feature fusion to identify segments for inclusion. It aims to capture distinct video events using a variety of features: k-means based weighting, speech, camera motion, significant differences in HSV color space, and a dynamic time warping (DTW) based feature that suppresses repeated scenes. The feature functions drive a weighted k-means clustering that identifies the visually distinct, important segments constituting the final summary. The optimal weights for the individual features are obtained with a gradient descent algorithm that maximizes the recall of ground-truth events from representative training videos. Analysis reveals a lengthy computation time but high-quality results (60% average recall over 42 test videos), based on manually judged inclusion of distinct shots. The summaries were judged relatively easy to view and had an average amount of redundancy. |
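The abstract's central step is a weighted k-means clustering in which each sample carries an importance weight derived from the fused feature scores. The paper does not give its exact formulation here, so the following is a minimal sketch under assumed details: samples are frame feature vectors, per-sample weights stand in for the fused feature scores, and centroids are updated as weighted means of their assigned samples (farthest-point initialization is an illustrative choice, not taken from the paper).

```python
import numpy as np

def weighted_kmeans(X, w, k, iters=50, seed=0):
    """Weighted k-means sketch: X is an (n, d) array of frame features,
    w is an (n,) array of importance weights (assumed stand-in for the
    paper's fused feature scores). Centroids are weighted cluster means."""
    rng = np.random.default_rng(seed)
    # farthest-point initialization: pick one point, then repeatedly add
    # the point farthest from the current centroid set
    centroids = [X[rng.integers(len(X))]]
    while len(centroids) < k:
        d = np.min(
            [np.linalg.norm(X - c, axis=1) for c in centroids], axis=0
        )
        centroids.append(X[d.argmax()])
    centroids = np.array(centroids, dtype=float)
    for _ in range(iters):
        # assign each sample to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # update each centroid as the weight-weighted mean of its cluster
        for j in range(k):
            mask = labels == j
            if mask.any():
                centroids[j] = np.average(X[mask], axis=0, weights=w[mask])
    return labels, centroids

# toy example: two well-separated groups of "frame features"
X = np.vstack([np.zeros((5, 2)), np.full((5, 2), 10.0)])
w = np.ones(len(X))
labels, cents = weighted_kmeans(X, w, k=2)
```

Raising the weight `w[i]` of a frame pulls its cluster's centroid toward it, so segments with strong fused feature responses dominate the representatives chosen for the summary.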
Year | DOI | Venue |
---|---|---|
2007 | 10.1145/1290031.1290047 | TVS |
Keywords | Field | DocType
---|---|---|
redundancy pruning, high-level feature fusion, dynamic time warping, average recall, rush video summarization, feature function, representative training video, lengthy computation time, distinct video event, individual feature, manually-judged inclusion of distinct shot, average amount, k means, gradient descent, ground truth | HSL and HSV, Automatic summarization, Computer vision, Gradient descent, Weighting, Dynamic time warping, Pattern recognition, Computer science, Redundancy (engineering), Ground truth, Artificial intelligence, Cluster analysis | Conference
Citations | PageRank | References
---|---|---|
15 | 0.78 | 8
Authors |
---|
7 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jim Kleban | 1 | 198 | 9.81 |
Anindya Sarkar | 2 | 233 | 15.01 |
Emily Moxley | 3 | 156 | 8.95 |
Stephen Mangiat | 4 | 24 | 1.72 |
Swapna Joshi | 5 | 28 | 3.18 |
Thomas Kuo | 6 | 27 | 2.41 |
B. S. Manjunath | 7 | 7561 | 783.37 |