Abstract |
---|
We present a new connectionist planning method [TML90]. By interaction with an unknown environment, a world model is progressively constructed using gradient descent. For deriving optimal actions with respect to future reinforcement, planning is applied in two steps: an experience network proposes a plan, which is subsequently optimized by gradient descent with a chain of world models, so that an optimal reinforcement may be obtained when the plan is actually run. The appropriateness of this method is demonstrated by a robotics application and a pole balancing task. |
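
The second planning step described in the abstract (refining a proposed plan by gradient descent through a chain of world models) can be illustrated with a small sketch. This is not the authors' implementation. It assumes a fixed linear model `x_next = A @ x + B @ u` standing in for the learned world model, an all-zero initial plan standing in for the experience network's proposal, and a squared distance to a goal state standing in for negative reinforcement. All names (`rollout`, `plan_gradient`, `optimize_plan`, `A`, `B`, `goal`) are hypothetical.

```python
import numpy as np

def rollout(A, B, x0, plan):
    """Chain the (assumed linear) world model over the plan; return all states."""
    states = [x0]
    for u in plan:
        states.append(A @ states[-1] + B @ u)
    return states

def plan_gradient(A, B, x0, plan, goal):
    """Backpropagate the final-state cost through the chain of models."""
    states = rollout(A, B, x0, plan)
    cost = float(np.sum((states[-1] - goal) ** 2))
    grad_x = 2.0 * (states[-1] - goal)    # d cost / d x_T
    grads = np.zeros_like(plan)
    for t in reversed(range(len(plan))):
        grads[t] = B.T @ grad_x           # d cost / d u_t (grad_x is w.r.t. x_{t+1})
        grad_x = A.T @ grad_x             # now d cost / d x_t
    return cost, grads

def optimize_plan(A, B, x0, plan, goal, lr=5.0, steps=200):
    """Plain gradient descent on the action sequence; lr is tuned for the toy below."""
    plan = plan.copy()
    costs = []
    for _ in range(steps):
        cost, grads = plan_gradient(A, B, x0, plan, goal)
        costs.append(cost)
        plan -= lr * grads
    return plan, costs

# Toy setup (assumption): position/velocity state, one force-like action per step.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
x0 = np.zeros(2)
goal = np.array([1.0, 0.0])
proposed_plan = np.zeros((10, 1))         # stand-in for the experience network's proposal

optimized, costs = optimize_plan(A, B, x0, proposed_plan, goal)
print(f"cost before: {costs[0]:.4f}, cost after: {costs[-1]:.6f}")
```

In the method summarized above, the world model is itself learned by gradient descent from interaction with the environment. The sketch fixes the model in advance only to keep the plan-optimization step isolated.
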
Year | Venue | Field |
---|---|---|
1990 | NIPS | Mathematical optimization, Gradient descent, Computer science, Pole balancing, Artificial intelligence, Reinforcement, Robotics, Machine learning, Connectionism |
DocType | ISBN | Citations |
Conference | 1-55860-184-8 | 8 |
PageRank | References | Authors |
3.55 | 3 | 3 |
Name | Order | Citations | PageRank |
---|---|---|---|
Sebastian Thrun | 1 | 20347 | 2302.56 |
Knut Möller | 2 | 59 | 34.75 |
Alexander Linden | 3 | 74 | 11.71 |