Dynamic scheduling for flexible job shop using a deep reinforcement learning approach

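Tardiness, Reinforcement learning, Computer science, Dynamic priority scheduling, Scheduling (production processes), Markov decision process, Mathematical optimization, Action selection, Job shop scheduling, Artificial intelligence, Markov process, Mathematics, Metro train timetable, Operating system, Statistics, Neuroscience, Perception, Biology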

Authors

Yong Gui, Dunbing Tang, Haihua Zhu, Yi Zhang, Zequn Zhang

Source

Journal: Computers & Industrial Engineering [Elsevier]
Date: 2023-04-21; Volume/issue: 180: 109255-109255; Citations: 132

Identifier

DOI: 10.1016/j.cie.2023.109255

Abstract

Due to the influence of dynamic changes in the manufacturing environment, a single dispatching rule (SDR) cannot consistently attain better results than other rules for dynamic scheduling problems. Dynamic selection of the most appropriate rule from several SDRs based on the Deep Q-Network (DQN) offers better scheduling performance than using an individual SDR. However, the discreteness of action space caused by the DQN and the simplicity of the action as an SDR limit the selection range and restrict performance improvement. Thus, in this paper, we propose a scheduling method based on deep reinforcement learning for the dynamic flexible job-shop scheduling problem (DFJSP), aiming to minimize the mean tardiness. Firstly, a Markov decision process with composite scheduling action is provided to elaborate the flexible job-shop dynamic scheduling process and transform the DFJSP into an RL task. Subsequently, a composite scheduling action aggregated by SDRs and continuous weight variables is designed to provide a continuous rule space and SDR weight selection. Moreover, a reward function related to mean tardiness performance criteria is designed such that maximizing the cumulative reward is equivalent to minimizing the mean tardiness. Finally, a policy network with states as inputs and weights as outputs is constructed to generate the scheduling decision at each decision point. Also, the deep deterministic policy gradient (DDPG) algorithm is used to train the policy network to select the most appropriate weights at each decision point, thereby aggregating the SDRs into a better rule. Results from numerical experiments reveal that the proposed scheduling method achieves significantly better scheduling results than an SDR and the DQN-based method in dynamically changeable manufacturing environments.
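
The composite scheduling action described in the abstract can be illustrated with a small sketch. The abstract does not specify how the SDRs are aggregated, so everything below is an assumption made for illustration: the three rules (SPT, EDD, minimum slack), the min-max normalization, the weighted sum, and the names `Job`, `select_job`, and `mean_tardiness` are hypothetical and are not taken from the paper. In the proposed method the weights would be produced by the DDPG-trained policy network from the current state; here they are passed in by hand.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Job:
    name: str
    processing_time: float  # time of the job's next operation
    due_date: float
    remaining_work: float   # total remaining processing time


# Each single dispatching rule (SDR) returns a score; LOWER means "dispatch sooner".
def spt(job: Job, now: float) -> float:
    return job.processing_time  # shortest processing time


def edd(job: Job, now: float) -> float:
    return job.due_date  # earliest due date


def slack(job: Job, now: float) -> float:
    return job.due_date - now - job.remaining_work  # minimum slack


SDRS = (spt, edd, slack)  # assumed rule set, not the paper's


def _normalize(values: Sequence[float]) -> list[float]:
    """Min-max scale scores to [0, 1] so rules with different units are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def select_job(queue: Sequence[Job], weights: Sequence[float], now: float) -> Job:
    """Pick the job with the lowest weighted sum of normalized SDR scores."""
    if len(weights) != len(SDRS):
        raise ValueError("one weight per SDR is required")
    columns = [_normalize([rule(job, now) for job in queue]) for rule in SDRS]
    scores = [
        sum(w * column[i] for w, column in zip(weights, columns))
        for i in range(len(queue))
    ]
    return queue[min(range(len(queue)), key=scores.__getitem__)]


def mean_tardiness(completions: Sequence[float], due_dates: Sequence[float]) -> float:
    """Average of max(0, completion - due date) over all jobs."""
    late = [max(0.0, c - d) for c, d in zip(completions, due_dates)]
    return sum(late) / len(late)


queue = [Job("A", 2, 30, 10), Job("B", 8, 12, 9), Job("C", 5, 20, 15)]
pure_spt = select_job(queue, (1.0, 0.0, 0.0), now=0.0)
pure_edd = select_job(queue, (0.0, 1.0, 0.0), now=0.0)
blended = select_job(queue, (0.5, 0.5, 0.0), now=0.0)
print(pure_spt.name, pure_edd.name, blended.name)
```

With one-hot weights the sketch reduces to a single SDR, which corresponds to the discrete choice a DQN-style selector makes. Intermediate weights can pick a job that neither pure rule would choose, which is the sense in which continuous weights widen the rule space.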
