Computer science
Reinforcement learning
Job-shop scheduling
Markov decision process
Job shop
Artificial intelligence
Scheduling (production processes)
Metaheuristic
Mathematical optimization
Machine learning
Flow-shop scheduling
Markov process
Metro train timetable
Mathematics
Statistics
Operating system
Authors
Jiang‐Ping Huang,Liang Gao,Xinyu Li
Identifier
DOI:10.1016/j.eswa.2023.121756
Abstract
The Distributed Job-shop Scheduling Problem (DJSP) is a hotspot in both industry and academia due to its valuable applications in real-life production. For the DJSP, existing methods always complete job selection first and then search for an appropriate factory to assign the selected job to, meaning that job selection and job assignment are made independently. This paper proposes an end-to-end Deep Reinforcement Learning (DRL) method that makes the two decisions simultaneously. To capture the problem characteristics and realize the objective optimization, the Markov Decision Process (MDP) of the DJSP is formulated. A specialised action space made up of operation-factory pairs is designed to achieve the simultaneous decision-making. A stitched disjunctive graph representation of the DJSP is specially designed, and a Graph Neural Network (GNN) based feature-extraction architecture is proposed to extract the state embedding during problem solving. A Proximal Policy Optimization (PPO) method is applied to train the action-selection policy. To further guide the agent to assign jobs to the factory with the smaller makespan, a probability enhancement mechanism is designed. Experimental results on 240 test instances show that the proposed method outperforms 8 classical Priority Dispatching Rules (PDRs), 3 closely related RL methods, and 5 metaheuristics in terms of effectiveness, stability, and generalization.
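The abstract outlines two key ingredients, a composite action space of operation-factory pairs and a probability enhancement that biases the policy toward the factory with the smaller partial makespan, without giving implementation detail. The following is a minimal NumPy sketch of how such a mechanism might look; the function name enhanced_policy, the bonus parameter beta, and all toy numbers are hypothetical illustrations and not the paper's actual method.

import numpy as np

def enhanced_policy(logits, pairs, factory_makespans, mask, beta=1.0):
    # logits: raw pair scores, e.g. from a GNN policy head (shape (P,))
    # pairs: list of (operation_id, factory_id) tuples, one per logit
    # factory_makespans: current partial makespan of each factory (shape (F,))
    # mask: boolean feasibility mask over pairs (shape (P,))
    # beta: strength of the enhancement; a hypothetical tuning knob
    best_factory = int(np.argmin(factory_makespans))
    bonus = np.array([beta if f == best_factory else 0.0 for _, f in pairs])
    scores = np.where(mask, logits + bonus, -np.inf)   # infeasible pairs get zero probability
    exp = np.exp(scores - scores[mask].max())          # numerically stable softmax
    return exp / exp.sum()

# Toy usage: 2 factories, 4 candidate (operation, factory) pairs.
pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]
logits = np.array([0.2, 0.4, 0.1, 0.3])
mask = np.array([True, True, True, False])   # pair (1, 1) is infeasible here
makespans = np.array([12.0, 9.0])            # factory 1 currently has the smaller makespan
probs = enhanced_policy(logits, pairs, makespans, mask)
chosen_op, chosen_factory = pairs[int(np.random.choice(len(pairs), p=probs))]

In the paper itself the logits would come from the PPO-trained GNN embeddings over the stitched disjunctive graph; this sketch only illustrates how a bias toward the smaller-makespan factory can be folded into the action distribution over operation-factory pairs.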