Abstract | ||
---|---|---|
This paper proposes a method to learn a reconfigurable and sparse scene representation in the joint space of spatial configuration and appearance in a principled way. We call it the tangram model, which has three properties: (1) Unlike the fixed structure of the spatial pyramid widely used in the literature, it uses a compositional shape dictionary organized in an And-Or directed acyclic graph (AOG) to quantize the space of spatial configurations. (2) The shape primitives (called tans) in the dictionary can be described using any “off-the-shelf” appearance features, depending on the task. (3) A dynamic programming (DP) algorithm is used to learn the globally optimal parse tree in the joint space of spatial configuration and appearance. We demonstrate the tangram model in both a generative learning formulation and a discriminative matching kernel. In experiments, we show that the tangram model captures meaningful spatial configurations as well as appearance for various scene categories, and achieves state-of-the-art classification performance on the LSP 15-class scene dataset and the MIT 67-class indoor scene dataset. |
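
The dynamic programming step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simplified dictionary containing only axis-aligned rectangles on a grid (the tangram model's tans are richer), binary horizontal or vertical cuts as the only And-nodes, and a hypothetical `terminal_score` callback standing in for whatever appearance model scores a primitive. Each rectangle acts as an Or-node that either terminates as a single primitive or picks its best split, and memoization ensures every rectangle is solved once.

```python
from functools import lru_cache


def best_parse(grid_w, grid_h, terminal_score):
    """Return (score, tree) for the best parse of a grid_w x grid_h lattice.

    terminal_score(x0, y0, x1, y1) is a hypothetical appearance score for
    treating the rectangle [x0, x1) x [y0, y1) as a single primitive.
    """

    @lru_cache(maxsize=None)
    def solve(x0, y0, x1, y1):
        # Or-node: pick the best alternative for this rectangle.
        # Alternative 1: terminate, using the rectangle as one primitive.
        best = (terminal_score(x0, y0, x1, y1), ("leaf", (x0, y0, x1, y1)))
        # Alternative 2: And-node with a vertical cut at column x.
        for x in range(x0 + 1, x1):
            left_score, left_tree = solve(x0, y0, x, y1)
            right_score, right_tree = solve(x, y0, x1, y1)
            if left_score + right_score > best[0]:
                best = (left_score + right_score,
                        ("vsplit", left_tree, right_tree))
        # Alternative 3: And-node with a horizontal cut at row y.
        for y in range(y0 + 1, y1):
            top_score, top_tree = solve(x0, y0, x1, y)
            bottom_score, bottom_tree = solve(x0, y, x1, y1)
            if top_score + bottom_score > best[0]:
                best = (top_score + bottom_score,
                        ("hsplit", top_tree, bottom_tree))
        return best

    return solve(0, 0, grid_w, grid_h)


# Toy example: a 2 x 2 "image" whose top row differs from its bottom row.
cells = [[0, 0],
         [1, 1]]  # indexed as cells[y][x]


def toy_score(x0, y0, x1, y1):
    # Each primitive costs 1 (a crude sparsity prior); mixing distinct
    # cell values inside one primitive costs 10 per extra value.
    values = {cells[y][x] for y in range(y0, y1) for x in range(x0, x1)}
    return -1.0 - 10.0 * (len(values) - 1)


score, tree = best_parse(2, 2, toy_score)
print(score, tree)
```

On this toy input the best parse is a single horizontal cut into two homogeneous rows, since any other configuration either mixes values inside a primitive or pays for more primitives than necessary. The per-primitive cost is an illustrative stand-in for sparsity, not the paper's objective.
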
Year | DOI | Venue |
---|---|---|
2012 | 10.1109/WACV.2012.6163023 | WACV |
Keywords | Field | DocType |
lsp 15-class scene dataset,optimal parse tree,indoor scene dataset,compositional shape dictionary,image representation,image matching,mit 67-class indoor scene dataset,trees (mathematics),tangram model,learning (artificial intelligence),various scene category,meaningful spatial configuration,discriminative matching kernel,and-or directed acyclic graph,state-of-the-art classification,generative learning,off-the-shelf appearance features,image classification,reconfigurable scene representation,joint space,natural scenes,sparse scene representation,directed graphs,dynamic programming algorithm,spatial pyramid,dynamic programming,reconfigurable sparse scene representation,spatial configuration,dictionaries,global optimization,shape,lattices,kernel,matching pursuit,directed acyclic graph,learning artificial intelligence | Kernel (linear algebra),Computer vision,Parse tree,Pattern recognition,Computer science,Directed graph,Directed acyclic graph,Artificial intelligence,Pyramid,Contextual image classification,Discriminative model,Generative model | Conference |
ISSN | ISBN | Citations |
1550-5790 | 978-1-4673-0232-6 | 12 |
PageRank | References | Authors |
0.60 | 10 | 5 |
Name | Order | Citations | PageRank |
---|---|---|---|
Jun Zhu | 1 | 1926 | 154.82 |
Tianfu Wu | 2 | 331 | 26.72 |
Song-Chun Zhu | 3 | 6580 | 741.75 |
Xiaokang Yang | 4 | 3581 | 238.09 |
Wenjun Zhang | 5 | 1789 | 177.28 |