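Tardiness
Reinforcement learning
Computer science
Dynamic priority scheduling
Scheduling (production processes)
Markov decision process
Mathematical optimization
Action selection
Job-shop scheduling
Artificial intelligence
Markov process
Mathematics
Metro train timetable
Operating system
Statistics
Neuroscience
Perception
Biology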
Authors
Yong Gui, Dunbing Tang, Haihua Zhu, Yi Zhang, Zequn Zhang
Identifier
DOI:10.1016/j.cie.2023.109255
Abstract
Because the manufacturing environment changes dynamically, no single dispatching rule (SDR) consistently outperforms the others on dynamic scheduling problems. Using a Deep Q-Network (DQN) to select the most appropriate rule from several SDRs dynamically offers better scheduling performance than using an individual SDR. However, the DQN imposes a discrete action space, and each action is only a single SDR; together these limit the selection range and restrict further performance improvement. We therefore propose a scheduling method based on deep reinforcement learning (RL) for the dynamic flexible job-shop scheduling problem (DFJSP), aiming to minimize the mean tardiness. First, a Markov decision process with a composite scheduling action is formulated to describe the dynamic flexible job-shop scheduling process and transform the DFJSP into an RL task. Second, the composite scheduling action, which aggregates SDRs through continuous weight variables, is designed to provide a continuous rule space and allow the SDR weights to be selected. Third, a reward function tied to the mean tardiness criterion is designed so that maximizing the cumulative reward is equivalent to minimizing the mean tardiness. Finally, a policy network that takes states as inputs and produces weights as outputs is constructed to generate the scheduling decision at each decision point, and the deep deterministic policy gradient (DDPG) algorithm is used to train it to select the most appropriate weights at each decision point, thereby aggregating the SDRs into a better rule. Numerical experiments show that the proposed method achieves significantly better scheduling results than a single SDR and the DQN-based method in dynamically changing manufacturing environments.
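The idea of a composite scheduling action can be illustrated with a minimal Python sketch. The abstract does not give the aggregation formula, so everything below is an assumption made for illustration: the four rules (SPT, EDD, FIFO, minimum slack), the min-max normalization, the weighted-sum form, and all names (`Job`, `RULES`, `composite_select`) are hypothetical. In the described method the weights would come from the DDPG-trained policy network at each decision point; here they are fixed by hand.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Job:
    name: str
    processing_time: float  # processing time of the job's next operation
    due_date: float
    arrival_time: float


# Hypothetical SDRs: each maps (job, now) to a score; a lower score means
# the job is preferred by that rule.
def spt(job: Job, now: float) -> float:    # shortest processing time
    return job.processing_time

def edd(job: Job, now: float) -> float:    # earliest due date
    return job.due_date

def fifo(job: Job, now: float) -> float:   # first in, first out
    return job.arrival_time

def slack(job: Job, now: float) -> float:  # minimum slack
    return job.due_date - now - job.processing_time

RULES: List[Callable[[Job, float], float]] = [spt, edd, fifo, slack]


def normalize(values: Sequence[float]) -> List[float]:
    """Min-max scale to [0, 1] so rules with different units are comparable."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


def composite_select(jobs: Sequence[Job], weights: Sequence[float], now: float) -> Job:
    """Pick the job with the lowest weighted sum of normalized rule scores.

    Assumed aggregation: score(job) = sum_r weights[r] * normalized_rule_r(job).
    """
    if len(weights) != len(RULES):
        raise ValueError("expected one weight per rule")
    per_rule = [normalize([rule(j, now) for j in jobs]) for rule in RULES]
    scores = [
        sum(w * per_rule[r][i] for r, w in enumerate(weights))
        for i in range(len(jobs))
    ]
    best = min(range(len(jobs)), key=scores.__getitem__)
    return jobs[best]


queue = [
    Job("A", processing_time=5, due_date=20, arrival_time=0),
    Job("B", processing_time=2, due_date=30, arrival_time=1),
    Job("C", processing_time=8, due_date=12, arrival_time=2),
]

pure_spt = composite_select(queue, [1.0, 0.0, 0.0, 0.0], now=3.0)
pure_edd = composite_select(queue, [0.0, 1.0, 0.0, 0.0], now=3.0)
blended = composite_select(queue, [0.5, 0.5, 0.0, 0.0], now=3.0)
```

With a one-hot weight vector the composite action reduces to a single SDR (job B under SPT, job C under EDD), which is the discrete choice a DQN-style selector is limited to. An equal blend of SPT and EDD instead selects job A, a decision neither of those two rules makes on its own, which is the sense in which continuous weights enlarge the rule space.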