A Scalable Reinforcement Learning Algorithm for Scheduling Railway Lines
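Reinforcement learning
Computer science
Scalability
Scheduling (production processes)
Distributed computing
Artificial intelligence
Mathematical optimization
Mathematics
Operating system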
Author
Harshad Khadilkar
Source
Journal: IEEE Transactions on Intelligent Transportation Systems [Institute of Electrical and Electronics Engineers]; Date: 2019-02-01; Volume (Issue): 20 (2), pp. 727-736; Citations: 39
Identifier
DOI:10.1109/tits.2018.2829165
Abstract
This paper describes an algorithm for scheduling bidirectional railway lines (both single- and multi-track) using a reinforcement learning (RL) approach. The goal is to define the track allocations and arrival/departure times for all trains on the line, given their initial positions, priority, halt times, and traversal times, while minimizing the total priority-weighted delay. The primary advantage of the proposed algorithm compared to exact approaches is its scalability, and compared to heuristic approaches is its solution quality. Efficient scaling is ensured by decoupling the size of the state-action space from the size of the problem instance. Improved solution quality is obtained because of the inherent adaptability of reinforcement learning to specific problem instances. An additional advantage is that the learning from one instance can be transferred with minimal re-learning to another instance with different infrastructure resources and traffic mix. It is shown that the solution quality of the RL algorithm exceeds that of two prior heuristic-based approaches while having comparable computation times. Two lines from the Indian rail network are used for demonstrating the applicability of the proposed algorithm in the real world.
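The objective named in the abstract, total priority-weighted delay, can be illustrated with a small sketch. This is not the paper's formulation: the abstract does not say how priorities map to weights, where delay is measured, or whether early arrivals count. The sketch assumes one numeric weight per train (larger meaning more important), delay measured once at the destination in minutes, and early arrivals counted as zero delay. The names `TrainRun`, `priority_weight`, and `priority_weighted_delay` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class TrainRun:
    """One train's outcome under a candidate schedule (hypothetical structure)."""
    name: str
    priority_weight: float    # assumption: larger weight = higher priority
    scheduled_arrival: float  # minutes from the start of the horizon
    actual_arrival: float     # minutes from the start of the horizon


def priority_weighted_delay(runs: Iterable[TrainRun]) -> float:
    """Sum of weight * delay over all trains.

    Assumption: early arrivals contribute zero rather than a negative delay.
    """
    return sum(
        r.priority_weight * max(0.0, r.actual_arrival - r.scheduled_arrival)
        for r in runs
    )


# Illustrative data only; not taken from the paper.
runs = [
    TrainRun("express", priority_weight=3.0, scheduled_arrival=100.0, actual_arrival=110.0),
    TrainRun("freight", priority_weight=1.0, scheduled_arrival=200.0, actual_arrival=220.0),
    TrainRun("local", priority_weight=2.0, scheduled_arrival=150.0, actual_arrival=140.0),
]
total = priority_weighted_delay(runs)
```

Under these assumptions, a scheduler comparing two candidate timetables would prefer the one with the smaller `total`, so delaying a low-weight train to let a high-weight train pass can lower the objective even when the unweighted delay rises.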