Abstract: In this work, we introduce a scalable, decentralized deep reinforcement learning (DRL) scheme for traffic signal control. The work builds on previous results using multi-agent DRL, with a new state representation and new reward definitions. The state representation is a coarse image of traffic, and the candidate reward functions are tested on the simulated Monaco SUMO Traffic (MoST) scenario. Based on extensive numerical experimentation, we find that the most appropriate reward function is related to minimizing the average time vehicles spend in the network, with various modifications that improve the learning process. The resulting algorithm performs better than the previous one on which it is based and markedly better than a non-learning-based greedy policy.
Keywords: Intelligent Transportation; traffic control systems; Automatic control; optimization; real-time operations in transportation; reinforcement learning control; integrated traffic management