A thousand words in a scene. - Citegraph

Paper Info

Title
A thousand words in a scene.

Abstract
This paper presents a novel approach for visual scene modeling and classification, investigating the combined use of text modeling methods and local invariant features. Our work attempts to elucidate (1) whether a text-like bag-of-visterms representation (histogram of quantized local visual features) is suitable for scene (rather than object) classification, (2) whether some analogies between discrete scene representations and text documents exist, and (3) whether unsupervised, latent space models can be used both as feature extractors for the classification task and to discover patterns of visual co-occurrence. Using several data sets, we validate our approach, presenting and discussing experiments on each of these issues. We first show, with extensive experiments on binary and multi-class scene classification tasks using a 9,500-image data set, that the bag-of-visterms representation consistently outperforms classical scene classification approaches. In other data sets we show that our approach competes with or outperforms other recent, more complex, methods. We also show that Probabilistic Latent Semantic Analysis (PLSA) generates a compact scene representation, discriminative for accurate classification, and more robust than the bag-of-visterms representation when less labeled training data is available. Finally, through aspect-based image ranking experiments, we show the ability of PLSA to automatically extract visually meaningful scene patterns, making such representation useful for browsing image collections.
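The abstract describes the bag-of-visterms representation as a histogram of quantized local visual features. A minimal Python sketch of that idea follows; it is an illustration, not the authors' implementation. The function names, the toy 2-D descriptors, and the three-entry vocabulary are all hypothetical, and the vocabulary is assumed to be given (the abstract does not say how it is built).

```python
from math import dist

def nearest_visterm(descriptor, vocabulary):
    """Return the index of the vocabulary centre closest to a descriptor."""
    return min(range(len(vocabulary)),
               key=lambda k: dist(descriptor, vocabulary[k]))

def bag_of_visterms(descriptors, vocabulary):
    """Count how many local descriptors are quantized to each visterm."""
    histogram = [0] * len(vocabulary)
    for d in descriptors:
        histogram[nearest_visterm(d, vocabulary)] += 1
    return histogram

# Toy data (hypothetical): three 2-D visterm centres, four local descriptors.
vocabulary = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.2), (0.9, 1.2), (4.8, 5.1), (1.1, 0.8)]

histogram = bag_of_visterms(descriptors, vocabulary)
print(histogram)
```

Like a bag-of-words vector for text, the histogram discards the spatial layout of the descriptors and keeps only their counts. A classifier or a latent space model such as PLSA would then take this count vector as input.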

Year	DOI	Venue
2007	10.1109/TPAMI.2007.1155	IEEE Trans. Pattern Anal. Mach. Intell.
Keywords	Field	DocType
classical scene classification approach,discrete scene representation,scene classification,multiclass scene classification task,compact scene representation,quantized local descriptors,accurate classification,meaningful scene pattern,image representation,thousand words,latent aspect modeling,object recognition,visual scene modeling,classification task,bov representation,data mining,layout,detectors,image segmentation,text analysis,feature extraction,probability,image classification,construction industry,indexing terms,probabilistic latent semantic analysis	Computer vision,Histogram,Pattern recognition,Ranking,Computer science,Feature extraction,Image segmentation,Probabilistic latent semantic analysis,Artificial intelligence,Contextual image classification,Discriminative model,Cognitive neuroscience of visual object recognition	Journal
Volume	Issue	ISSN
29	9	0162-8828
Citations	PageRank	References
115	3.83	32
Authors
5

Authors (5 rows)

Name	Order	Citations	PageRank
Pedro Quelhas	1	261	21.51
Florent Monay	2	593	31.43
Jean-Marc Odobez	3	518	31.95
Daniel Gatica-Perez	4	4182	276.74
Tinne Tuytelaars	5	10161	609.66