Abstract |
---|
Current road-traffic optimisation practice around the world is a combination of hand-tuned policies with a small degree of automatic adaptation. Even state-of-the-art research controllers need good models of the road traffic, which cannot be obtained directly from existing sensors. We use a policy-gradient reinforcement learning approach to directly optimise the traffic signals, mapping currently deployed sensor observations to control signals. Our trained controllers are (theoretically) compatible with the traffic system used in Sydney and many other cities around the world. We apply two policy-gradient methods: (1) the recent natural actor-critic algorithm, and (2) a vanilla policy-gradient algorithm for comparison. Along the way we extend natural actor-critic approaches to work for distributed and online infinite-horizon problems. |
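The paper's own code is not reproduced on this page. For orientation only, below is a minimal sketch of the kind of online "vanilla" policy-gradient controller the abstract describes: a softmax policy maps raw sensor observations to a signal phase, and an eligibility trace of score functions drives an online update. Everything here is an illustrative assumption, not the authors' implementation: the sensor and phase counts, the linear-softmax parameterisation, and the toy `simulate_step` stand-in for a traffic simulator are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

N_SENSORS, N_PHASES = 8, 4                # hypothetical problem sizes
theta = np.zeros((N_SENSORS, N_PHASES))   # policy parameters

def softmax_policy(obs, theta):
    """Probability of each signal phase given raw sensor readings."""
    logits = obs @ theta
    logits -= logits.max()                # numerical stability
    p = np.exp(logits)
    return p / p.sum()

def simulate_step(obs, phase):
    """Toy stand-in for a traffic simulator: reward penalises occupancy."""
    next_obs = np.clip(obs + rng.normal(0.0, 0.1, N_SENSORS), 0.0, 1.0)
    reward = -float(next_obs.sum())
    return next_obs, reward

alpha, beta = 0.01, 0.99                  # step size, trace discount
obs = rng.random(N_SENSORS)
trace = np.zeros_like(theta)              # eligibility trace of score functions
baseline = 0.0                            # running reward average as a baseline

for t in range(10_000):
    p = softmax_policy(obs, theta)
    phase = rng.choice(N_PHASES, p=p)
    # Score function of the softmax policy: gradient of log pi(phase | obs)
    grad_log = np.outer(obs, np.eye(N_PHASES)[phase] - p)
    trace = beta * trace + grad_log
    obs, reward = simulate_step(obs, phase)
    baseline += 0.01 * (reward - baseline)
    theta += alpha * (reward - baseline) * trace   # online gradient ascent
```

The eligibility trace is what makes the update usable online in the infinite-horizon setting the abstract mentions; the natural actor-critic variant the paper compares against would additionally precondition this gradient with an estimate of the inverse Fisher information matrix.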
Year | Venue | Keywords
---|---|---
2006 | NIPS | gradient method

Field | DocType | Citations
---|---|---
Computer science, Road traffic, Artificial intelligence, Traffic system, Machine learning, Reinforcement learning | Conference | 35

PageRank | References | Authors
---|---|---
2.32 | 7 | 3

Name | Order | Citations | PageRank
---|---|---|---
Silvia Richter | 1 | 35 | 2.32 |
Douglas Aberdeen | 2 | 226 | 17.21 |
Jin Yu | 3 | 41 | 6.25 |