Abstract |
---|
The spatial pyramid and its variants have been among the most popular and successful models for object recognition. In these models, local visual features are coded across elements of a visual vocabulary, and these codes are then pooled into histograms at several spatial granularities. We introduce spatially local coding, an alternative way to include spatial information in the image model. Instead of coding only visual appearance and leaving spatial coherence to be represented by the pooling stage, we include location as part of the coding step. This representation is more flexible than the fixed grids used in spatial pyramid models, and it lets us pool over a single whole-image region. We demonstrate that combining features with multiple levels of spatial locality performs better than using a single level. Our model performs better than all previous single-feature methods when tested on the Caltech 101 and Caltech 256 object recognition datasets. |
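
The idea of coding location together with appearance can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how location enters the coding step, so the sketch assumes that normalized (x, y) coordinates, scaled by a hypothetical weight `lam`, are appended to each local descriptor, that coding is hard nearest-codeword assignment, and that pooling is a normalized whole-image histogram. All function names are illustrative.

```python
import numpy as np

def location_augmented(descriptors, xy, image_size, lam):
    """Append normalized (x, y), scaled by lam, to each local descriptor.

    Assumption: lam controls spatial locality (lam = 0 ignores location).
    """
    width, height = image_size
    norm_xy = np.asarray(xy, dtype=float) / np.array([width, height], dtype=float)
    return np.hstack([np.asarray(descriptors, dtype=float), lam * norm_xy])

def code_and_pool(features, codebook):
    """Hard-assign each feature to its nearest codeword, then pool the
    assignments into one normalized histogram over the whole image."""
    # Squared Euclidean distance from every feature to every codeword.
    d2 = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    hist = np.bincount(nearest, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

def multi_level_representation(descriptors, xy, image_size, codebooks):
    """Concatenate whole-image histograms from several locality levels.

    codebooks: dict mapping a locality weight lam to a codebook array of
    shape (K, D + 2), assumed to be learned in the same augmented space.
    """
    parts = [
        code_and_pool(location_augmented(descriptors, xy, image_size, lam), cb)
        for lam, cb in codebooks.items()
    ]
    return np.concatenate(parts)
```

Under these assumptions, two patches with identical appearance but distant positions can fall on different codewords when `lam` is large, so a single whole-image histogram still carries spatial information without a pooling grid.
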
Year | DOI | Venue |
---|---|---|
2012 | 10.1007/978-3-642-37331-2_16 | ACCV (1) |
Keywords | Field | DocType |
object recognition,spatially local coding,spatial pyramid model,spatial pyramid,spatial coherence,spatial information,spatial granularity,coding step,spatial locality,local visual feature,flexible spatial representation | Spatial analysis,Computer vision,Caltech 101,Pattern recognition,Computer science,Pooling,Coding (social sciences),Artificial intelligence,Object-based spatial database,Pyramid,Cognitive neuroscience of visual object recognition,Visual appearance | Conference |
Citations | PageRank | References |
29 | 1.02 | 16 |
Authors |
---|
2 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sancho McCann | 1 | 200 | 7.28 |
D. G. Lowe | 2 | 15718 | 1413.60 |