Reinforcement learning
Adaptability
Scheduling (production processes)
Computer science
Job shop scheduling
Learning effect
Job shop
Flow shop scheduling
Mathematical optimization
Industrial engineering
Artificial intelligence
Engineering
Mathematics
Metro train timetable
Economics
Microeconomics
Operating system
Management
Authors
Haoxiang Wang, Bhaba R. Sarker, Jing Li, Jian Li
Identifiers
DOI:10.1080/00207543.2020.1794075
Abstract
To address production-environment uncertainty in assembly job shops, and exploiting the real-time nature of reinforcement learning, a dual Q-learning (D-Q) method is proposed that enhances adaptability to environmental changes through self-learning for the assembly job shop scheduling problem. Based on the objective of minimising the total weighted earliness penalty and completion-time cost, the top-level Q-learning focuses on localised targets, finding a dispatching policy that minimises machine idleness and balances machine loads, while the bottom-level Q-learning focuses on global targets, learning an optimal scheduling policy that minimises the overall earliness of all jobs. Theoretical results and simulation experiments indicate that, under different product-arrival frequencies, the proposed algorithm generally achieves better results than single Q-learning (S-Q) and other scheduling rules, and shows good adaptive performance.
Abbreviations: AFSSP, assembly flow shop scheduling problem; AJSSP, assembly job shop scheduling problem; RL, reinforcement learning; TASP, two-stage assembly scheduling problem
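The abstract describes Q-learning agents that learn a dispatching policy over shop states. As a rough illustration only (not the authors' D-Q implementation, whose states, actions, and rewards are not given in the abstract), the following sketch shows a tabular Q-learning agent choosing among hypothetical dispatching rules (SPT, EDD, FIFO) with a stand-in state space and reward signal:

```python
import random
from collections import defaultdict

# Hypothetical dispatching actions; the paper's actual action set is not
# specified in the abstract.
ACTIONS = ["SPT", "EDD", "FIFO"]  # shortest processing time, earliest due date, first-in-first-out

ALPHA, GAMMA, EPS = 0.1, 0.9, 0.2  # learning rate, discount factor, exploration rate

Q = defaultdict(float)  # Q[(state, action)] -> estimated value

def choose_action(state):
    """Epsilon-greedy selection of a dispatching rule for the current shop state."""
    if random.random() < EPS:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """One-step Q-learning update toward reward plus discounted best next value."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])

# Toy training loop: states are coarse machine-load levels; the reward is an
# invented stand-in for penalising idleness and earliness, not the paper's.
random.seed(0)
for _ in range(1000):
    s = random.choice(["low_load", "high_load"])
    a = choose_action(s)
    r = 1.0 if (s == "high_load" and a == "SPT") else 0.0
    update(s, a, r, "low_load")
```

In the paper's hierarchical setting, two such agents would be stacked: one driven by local machine-level signals (idleness, load balance) and one by the global earliness objective; the sketch above shows only the shared single-agent update mechanics.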