Title
Learning from Audience Interaction: Multi-Instance Multi-Label Topic Model for Video Shots Annotating
Abstract
In recent years, audiences have been able to find TV series and movies of interest easily by their labels. However, finding shots with specific semantic content within these videos remains difficult, because annotating individual video shots with labels is still an open problem. Some existing approaches train models on annotated shots, which are costly to label manually. Other methods assume that the content of a video is limited to what its labels describe, ignoring that video-level labels are too coarse-grained to cover all of the video's content. In this paper, we propose a multi-label, multi-instance topic model that annotates video shots using video labels. In the multi-label, multi-instance framework, video shots are treated as instances and shot labels are learned from video-level labels, which reduces the cost of labeling. In addition, our model learns label semantics by controlling the relationship between video labels and shots, which addresses the coarse-grained label problem. We also learn keywords for every video. Experiments on a large-scale real-world dataset show that our model substantially outperforms the baseline models.
Year: 2021
DOI: 10.1109/CSCWD49262.2021.9437805
Venue: PROCEEDINGS OF THE 2021 IEEE 24TH INTERNATIONAL CONFERENCE ON COMPUTER SUPPORTED COOPERATIVE WORK IN DESIGN (CSCWD)
Keywords: video shots annotating, multi-label multi-instance, topic model, time-sync video comments
DocType: Conference
Citations: 0
PageRank: 0.34
References: 0
Authors: 5
Name | Order | Citations | PageRank
Zehua Zeng | 1 | 0 | 0.34
Neng Gao | 2 | 1 | 6.44
Cong Xue | 3 | 1 | 3.40
Yuanye He | 4 | 2 | 2.07
Xiaobo Guo | 5 | 0 | 2.03