Title
Towards Efficient Learning of Optimal Spatial Bag-of-Words Representations
Abstract
Spatial Pyramid Matching (SPM) assumes that the spatial Bag-of-Words (BoW) representation is independent of the data. However, evidence shows that this assumption usually leads to a suboptimal representation. In this paper, we propose a novel method called Jensen-Shannon (JS) Tiling that learns the BoW representation from data directly at the BoW level. JS Tiling is especially appropriate for large-scale datasets because it is orders of magnitude faster than existing methods while achieving comparable or even better classification precision. Experimental results on four benchmarks, including two TRECVID12 datasets, show that JS Tiling outperforms SPM and state-of-the-art methods. A runtime comparison demonstrates that selecting BoW representations with JS Tiling is more than 1,000 times faster than running classifiers. In addition, JS Tiling was an important component of CMU Teams' final submission to TRECVID 2012 Multimedia Event Detection.
Year: 2014
DOI: 10.1145/2578726.2578739
Venue: ICMR
Keywords: suboptimal representation, js tiling, spatial pyramid matching, multimedia event detection, bow representation, bow level, trecvid12 datasets validate, optimal spatial bag-of-words representations, cmu teams, proposed js tiling, towards efficient learning, large-scale datasets, bag of visual words, spm
Field: Bag-of-words model, Bag-of-words model in computer vision, Pattern recognition, Computer science, TRECVID, Artificial intelligence, Pyramid, Machine learning
DocType: Conference
Citations: 8
PageRank: 0.48
References: 20
Authors: 4
Name                      Order  Citations  PageRank
Jiang Lu                  1      755        37.16
Wei Tong                  2      53         2.75
Deyu Meng                 3      2025       105.31
Alexander G. Hauptmann    4      7472       558.23