Title | ||
---|---|---|
An Architecture for Learning "Potential Field" Cognitive Maps with an Application to Mobile Robotics |
Abstract | ||
---|---|---|
The learning architecture described in this article autonomously acquires a topographical (metric) map that encodes a measure of "value" for xy-Cartesian locations in an environment. Low-value areas arise from two sources. Direct negative reinforcement from the environment results when the robot discovers obstacles or has other "unpleasant" experiences. The other source of negative reinforcement is generated internally by the learning algorithm, as it identifies regions that are a long distance away from the "pleasant" places in the environment. Conversely, examples of "pleasant" places, where positive environmental reward is received, might be energy-charging sites or simply locations that the robot should visit in executing its daily tasks. In general, what the robot learns is a map of "motivational" tendencies, or "expectancies". In such a map, the value attached to a place comes to reflect a balance between the good and bad rewards attainable from that position. When the Temporal Difference learning part of the architecture is turned on, that measure of value comes to include an estimate of how far, in travel time, it is to positive reinforcement. The architecture is loosely based on an Adaptive Heuristic Critic structure. Exploration of a continuous-valued search space is conducted by an Evolution Strategy, tuned for fast and approximate optimization. Knowledge acquired autonomously from this exploration is stored in a Radial Basis Function (RBF) neural network. Inherent features of this neural network type lead to the creation of a "potential field" structure that exerts appetitive and aversive "forces" on the robot as it moves around in the environment. The results of simulation experiments are presented, with a view to illustrating the strengths and weaknesses of the architecture. The map-building architecture proposed here is intended to form part of an overall navigational system. In future work it will be integrated with a self-localization algorithm, landmark-based topological mapping, and a reactive system for dealing with local dynamics in the environment. |
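
The abstract's central idea, a value map stored in an RBF network and shaped by Temporal Difference learning, can be illustrated with a minimal sketch. This is not the author's implementation: the grid of Gaussian centres, the normalised activations, the TD(0) rule, the learning rate `alpha`, the discount `gamma`, and every name below (`RBFValueMap`, `td_update`, `force`, `demo`) are assumptions made for illustration. The Evolution Strategy that drives exploration in the article is omitted; a fixed path stands in for it.

```python
import math


class RBFValueMap:
    """Value map over (x, y) built from Gaussian radial basis functions.

    Hypothetical layout: one Gaussian unit per integer grid point.
    """

    def __init__(self, size=5, width=0.7):
        self.centres = [(float(i), float(j))
                        for i in range(size) for j in range(size)]
        self.weights = [0.0] * len(self.centres)
        self.width = width

    def _features(self, x, y):
        # Gaussian activation of every unit, normalised to sum to 1.
        acts = [math.exp(-((x - cx) ** 2 + (y - cy) ** 2)
                         / (2.0 * self.width ** 2))
                for cx, cy in self.centres]
        total = sum(acts)
        return [a / total for a in acts]

    def value(self, x, y):
        return sum(w * f for w, f in zip(self.weights, self._features(x, y)))

    def td_update(self, pos, reward, next_pos,
                  alpha=0.5, gamma=0.9, terminal=False):
        """One TD(0) step: move value(pos) toward reward + gamma * value(next_pos)."""
        feats = self._features(*pos)
        target = reward if terminal else reward + gamma * self.value(*next_pos)
        delta = target - self.value(*pos)
        for i, f in enumerate(feats):
            self.weights[i] += alpha * delta * f
        return delta

    def force(self, x, y, eps=1e-3):
        """Finite-difference gradient of the value map at (x, y).

        The vector points toward higher-valued places, which is one way to
        read the "potential field" forces mentioned in the abstract.
        """
        fx = (self.value(x + eps, y) - self.value(x - eps, y)) / (2 * eps)
        fy = (self.value(x, y + eps) - self.value(x, y - eps)) / (2 * eps)
        return fx, fy


def demo():
    """Train on a fixed diagonal path to a rewarded goal plus one obstacle bump."""
    vmap = RBFValueMap()
    path = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
    for _ in range(200):
        for here, there in zip(path, path[1:]):
            at_goal = there == path[-1]
            # Positive reward only on the step that reaches the goal.
            vmap.td_update(here, 1.0 if at_goal else 0.0, there,
                           terminal=at_goal)
        # Direct negative reinforcement at a hypothetical obstacle location.
        vmap.td_update((0.0, 4.0), -1.0, (0.0, 4.0), terminal=True)
    return vmap
```

In this sketch, places closer to the goal in travel steps end up with higher value because each step back along the path is discounted by `gamma`, while the obstacle location acquires a negative value from its direct penalty. Calling `demo().force(x, y)` returns the local gradient that a controller could follow.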
Year | DOI | Venue |
---|---|---|
2000 | 10.1177/105971230000800205 | ADAPTIVE BEHAVIOR |
Keywords | Field | DocType |
cognitive maps, mobile robotics, potential field maps, adaptive heuristic critic reinforcement learning, evolutionary computation, neural networks | Architecture, Cognitive map, Computer science, Evolutionary computation, Artificial intelligence, Artificial neural network, Robot, Reinforcement, Robotics, Form of the Good, Machine learning | Journal |
Volume | Issue | ISSN |
8 | 2 | 1059-7123 |
Citations | PageRank | References |
5 | 0.48 | 13 |
Authors | ||
Name | Order | Citations | PageRank |
---|---|---|---|
A. G. Pipe | 1 | 90 | 6.73 |