Title
Learning Navigation Behaviors End-to-End
Abstract
A longstanding goal of behavior-based robotics is to solve high-level navigation tasks using end-to-end navigation behaviors that directly map sensors to actions. Navigation behaviors, such as reaching a goal or following a path without collisions, can be learned from exploration and interaction with the environment, but are constrained by the type and quality of a robot's sensors, dynamics, and actuators. Traditional motion planning handles varied robot geometry and dynamics, but typically assumes high-quality observations. Modern vision-based navigation typically considers imperfect or partial observations, but simplifies the robot action space. With both approaches, the transition from simulation to reality can be difficult. Here, we learn two end-to-end navigation behaviors that avoid moving obstacles: point-to-point and path following. These policies receive noisy lidar observations and output robot linear and angular velocities. We train these policies in small, static environments with Shaped-DDPG, an adaptation of the Deep Deterministic Policy Gradient (DDPG) reinforcement learning method which optimizes reward and network architecture. Over 500 meters of on-robot experiments show that these policies generalize to new environments and moving obstacles, are robust to sensor, actuator, and localization noise, and can serve as robust building blocks for larger navigation tasks. The path-following and point-to-point policies are 83% and 56% more successful than the baseline, respectively.
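The abstract describes policies that map noisy lidar scans directly to linear and angular velocity commands, trained with a DDPG variant. A minimal sketch of such a deterministic actor follows; all layer sizes, velocity bounds, and names here are illustrative assumptions, not the paper's actual architecture (which Shaped-DDPG searches over automatically):

```python
import numpy as np

class LidarActor:
    """Minimal DDPG-style deterministic actor: maps a 1-D lidar scan to
    bounded (linear, angular) velocity commands.
    All sizes and bounds are illustrative, not the paper's architecture."""

    def __init__(self, n_beams=64, hidden=32, v_max=0.7, w_max=1.0, seed=0):
        rng = np.random.default_rng(seed)
        # Two-layer MLP with small random weights (untrained placeholder).
        self.w1 = rng.normal(0.0, 0.1, (n_beams, hidden))
        self.b1 = np.zeros(hidden)
        self.w2 = rng.normal(0.0, 0.1, (hidden, 2))
        self.b2 = np.zeros(2)
        # Clamp outputs to (hypothetical) actuator limits, as DDPG actors
        # typically do with a tanh output layer scaled by action bounds.
        self.bounds = np.array([v_max, w_max])

    def act(self, scan):
        h = np.maximum(0.0, scan @ self.w1 + self.b1)        # ReLU hidden layer
        return np.tanh(h @ self.w2 + self.b2) * self.bounds  # bounded (v, w)

# Example: a noisy simulated lidar scan -> velocity command.
actor = LidarActor()
scan = np.clip(np.random.default_rng(1).normal(3.0, 0.5, 64), 0.0, 5.0)
v, w = actor.act(scan)
```

In the paper's setup the weights would be trained by the DDPG critic/actor updates, and Shaped-DDPG would additionally tune the reward shaping and the network shape; this sketch only shows the observation-to-action interface.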
Year: 2018
Venue: arXiv: Robotics
Fields: Motion planning, Simulation, End-to-end principle, Network architecture, Artificial intelligence, Engineering, Point-to-point, Robot, Robotics, Actuator, Reinforcement learning
DocType:
Volume: abs/1809.10124
Citations: 0
Journal:
PageRank: 0.34
References: 0
Authors: 4
Name                     Order   Citations   PageRank
Hao-Tien Lewis Chiang    1       16          4.71
Aleksandra Faust         2       681         4.83
Marek Fiser              3       29          3.66
Anthony Francis          4       16          3.70