Abstract | ||
---|---|---|
Bag-of-words approaches have been shown to achieve state-of-the-art performance in large-scale multimedia event detection. However, the commonly used histogram representation of bag-of-words requires large codebook sizes and expensive nonlinear kernel based classifiers for optimal performance. To address these two issues, we present a two-part generative model for compact visual representation, based on the i-vector approach recently proposed for speech and audio modeling. First, we use a Gaussian mixture model (GMM) to model the joint distribution of local descriptors. Second, we use a low-dimensional factor representation that constrains the GMM parameters to a subspace that preserves most of the information. We further extend this method to incorporate overlapping spatial regions, forming a highly compact visual representation that achieves superior performance with fast linear classifiers. We evaluate the method on a large video dataset used in the TRECVID 2011 MED evaluation. With linear classifiers, the proposed representation, with one-tenth of the storage footprint, outperforms soft quantization histograms used in the top performing TRECVID 2011 MED systems. |
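
The abstract gives no formulas, so the following is only a minimal sketch of the first stage it describes, under stated assumptions. It takes a diagonal-covariance GMM that is already trained and computes the zeroth-order and centered first-order statistics of a set of local descriptors, which are the quantities an i-vector style factor model is typically estimated from. The function names, the toy parameters, and the choice of diagonal covariances are illustrative assumptions, not details taken from the paper.

```python
import math


def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Posterior probability of each mixture component given descriptor x."""
    logs = [
        math.log(w) + log_gauss_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    top = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - top) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def sufficient_stats(descriptors, weights, means, variances):
    """Return per-component soft counts and centered first-order statistics."""
    k, d = len(weights), len(means[0])
    counts = [0.0] * k
    first = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        for c, gamma in enumerate(posteriors(x, weights, means, variances)):
            counts[c] += gamma
            for j in range(d):
                first[c][j] += gamma * (x[j] - means[c][j])
    return counts, first


# Hypothetical toy model: two components in two dimensions.
weights = [0.5, 0.5]
means = [[0.0, 0.0], [4.0, 4.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
descriptors = [[0.1, -0.2], [3.9, 4.2], [0.3, 0.1]]

counts, first = sufficient_stats(descriptors, weights, means, variances)
```

In this sketch the soft counts sum to the number of descriptors, and the first-order statistics have one row per mixture component. A factor model as described in the abstract would then map these statistics to a low-dimensional vector that is passed to a linear classifier; that second stage is not shown here.
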
Year | DOI | Venue |
---|---|---|
2013 | 10.1145/2502081.2502138 | ACM Multimedia 2013 |
Keywords | Field | DocType |
effective linear classification,two-part generative model,gmm parameter,gaussian mixture model,state-of-the-art performance,compact visual representation,compact bag-of-words,low-dimensional factor representation,histogram representation,optimal performance,superior performance,proposed representation,linear classifier,bag of words,generative model | Bag-of-words model,Kernel (linear algebra),Computer vision,Histogram,Pattern recognition,Computer science,TRECVID,Artificial intelligence,Linear classifier,Mixture model,Generative model,Codebook | Conference |
Citations | PageRank | References |
1 | 0.38 | 11 |
Authors | ||
3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Xiaodan Zhuang | 1 | 433 | 24.71 |
Shuang Wu | 2 | 171 | 7.23 |
Pradeep Natarajan | 3 | 14 | 2.14 |