Abstract |
---|
Bag-of-words document representations are often used in text, image and video processing. While it is relatively easy to determine a suitable word dictionary for text documents, there is no simple mapping from raw images or videos to dictionary terms. The classical approach builds a dictionary using vector quantization over a large set of useful visual descriptors extracted from a training set, and uses a nearest-neighbor algorithm to count the number of occurrences of each dictionary word in documents to be encoded. More robust approaches have been proposed recently that represent each visual descriptor as a sparse weighted combination of dictionary words. While favoring a sparse representation at the level of visual descriptors, those methods however do not ensure that images have sparse representation. In this work, we use mixed-norm regularization to achieve sparsity at the image level as well as a small overall dictionary. This approach can also be used to encourage using the same dictionary words for all the images in a class, providing a discriminative signal in the construction of image representations. Experimental results on a benchmark image classification dataset show that when compact image or dictionary representations are needed for computational efficiency, the proposed approach yields better mean average precision in classification. |
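
The mixed-norm idea in the abstract can be illustrated with a small sketch. It is not the authors' optimization procedure; it only shows, under assumed conventions, how an l1/l2 penalty on a coefficient matrix (rows are dictionary words, columns are the descriptors of one image) removes whole dictionary words from an image at once. The function names, the matrix layout, and the toy values are hypothetical and chosen for illustration.

```python
import math

def mixed_norm(A):
    """l1/l2 mixed norm: sum over rows (dictionary words) of each row's l2 norm."""
    return sum(math.sqrt(sum(a * a for a in row)) for row in A)

def group_soft_threshold(A, lam):
    """Proximal operator of lam * mixed_norm(A).

    Each row's l2 norm is shrunk by lam; rows whose norm is at most lam
    become exactly zero, so the corresponding dictionary word is dropped
    for every descriptor in the image simultaneously.
    """
    out = []
    for row in A:
        norm = math.sqrt(sum(a * a for a in row))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        out.append([scale * a for a in row])
    return out

# Toy coefficients (assumed values): 3 dictionary words, 2 descriptors in one image.
A = [[3.0, 4.0],   # strongly used word
     [0.3, 0.4],   # weakly used word
     [0.0, 0.0]]   # unused word

shrunk = group_soft_threshold(A, 1.0)

# Indices of dictionary words still active for this image after shrinkage.
active_words = [j for j, row in enumerate(shrunk) if any(a != 0.0 for a in row)]
```

With these toy values only the first dictionary word stays active: the weakly used word is zeroed for all descriptors together, which is the image-level sparsity the abstract describes. A plain per-coefficient l1 penalty would instead shrink each entry independently and would not tie the descriptors of an image together.
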
Year | Venue | Field |
---|---|---|
2009 | NIPS | Video processing, Pattern recognition, K-SVD, Neural coding, Computer science, Sparse approximation, Regularization (mathematics), Vector quantization, Artificial intelligence, Contextual image classification, Discriminative model, Machine learning |
DocType | Citations | PageRank |
---|---|---|
Conference | 50 | 2.53 |
References | Authors |
---|---|
12 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Samy Bengio | 1 | 7213 | 485.82 |
Fernando Pereira | 2 | 17717 | 2124.79 |
Yoram Singer | 3 | 13455 | 1559.02 |
Dennis Strelow | 4 | 244 | 20.14 |