Abstract | ||
---|---|---|
Deep Recurrent Neural Network architectures, though remarkably capable at modeling sequences, lack an intuitive high-level spatio-temporal structure. Yet many problems in computer vision inherently have an underlying high-level structure and can benefit from it. Spatio-temporal graphs are a popular tool for imposing such high-level intuitions in the formulation of real-world problems. In this paper, we propose an approach for combining the power of high-level spatio-temporal graphs with the sequence learning success of Recurrent Neural Networks (RNNs). We develop a scalable method for casting an arbitrary spatio-temporal graph as a rich RNN mixture that is feedforward, fully differentiable, and jointly trainable. The proposed method is generic and principled, as it can be used to transform any spatio-temporal graph by employing a certain set of well-defined steps. Evaluations of the proposed approach on a diverse set of problems, ranging from modeling human motion to object interactions, show improvement over the state of the art by a large margin. We expect this method to empower new approaches to problem formulation through high-level spatio-temporal graphs and Recurrent Neural Networks. |
Year | DOI | Venue |
---|---|---|
2015 | 10.1109/CVPR.2016.573 | 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) |
Field | DocType | Volume |
Pattern recognition, Computer science, Recurrent neural network, Differentiable function, Time delay neural network, Artificial intelligence, Deep learning, Artificial neural network, Sequence learning, Machine learning, Feed forward, Scalability | Journal | abs/1511.05298 |
Issue | ISSN | Citations |
1 | 1063-6919 | 90 |
PageRank | References | Authors |
2.28 | 52 | 4 |
Name | Order | Citations | PageRank |
---|---|---|---|
Ashesh Jain | 1 | 98 | 2.80 |
Amir Roshan Zamir | 2 | 1262 | 40.17 |
Silvio Savarese | 3 | 3975 | 161.69 |
Ashutosh Saxena | 4 | 4575 | 227.88 |