Reinforcement learning
Computer science
Scheduling (production processes)
Reinforcement
Industrial engineering
Artificial intelligence
Operations research
Operations management
Engineering
Structural engineering
Authors
Linshan Ding, Zailin Guan, Mudassar Rauf, Lei Yue
Identifiers
DOI:10.1016/j.swevo.2024.101550
Abstract
This study considers the simultaneous minimization of makespan and total tardiness in the multi-objective multiplicity flexible job shop scheduling problem (MOMFJSP). A deep reinforcement learning framework employing a multi-policy proximal policy optimization (MPPPO) algorithm is developed to solve the MOMFJSP. The problem is treated as a Markov decision process, allowing an intelligent agent to make sequential decisions based on the current production status. The framework maintains multiple policy networks, each associated with a different objective weight vector; using MPPPO, these networks are optimized simultaneously to obtain a set of high-quality Pareto-optimal policies. Moreover, a fluid model is introduced to extract state features and to devise composite dispatching rules that serve as discrete actions. A multi-policy co-evolution mechanism (MPCEM) is proposed to facilitate collaborative evolution among the policy networks, supported by a reward mechanism that accounts for the objective weights. A training algorithm based on MPPPO is designed for learning across the multiple policy networks. The effectiveness and superiority of the proposed method are confirmed through comparisons with composite dispatching rules and other scheduling methods.
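As an illustration of the multi-policy idea described in the abstract, the sketch below generates evenly spaced objective weight vectors (one per policy network) over the two objectives, makespan and total tardiness, and scalarizes per-step objective increments into a reward. The function names and the linear weighted-sum scalarization are illustrative assumptions for exposition, not the paper's exact reward mechanism.

```python
import numpy as np


def weight_vectors(num_policies: int) -> np.ndarray:
    """Evenly spaced weight vectors over two objectives.

    Row i is (w_makespan, w_tardiness) for policy i; rows sum to 1.
    """
    w1 = np.linspace(0.0, 1.0, num_policies)
    return np.stack([w1, 1.0 - w1], axis=1)


def scalarized_reward(delta_makespan: float,
                      delta_tardiness: float,
                      w: np.ndarray) -> float:
    """Weighted-sum reward: penalize the weighted increase of both objectives."""
    return -(w[0] * delta_makespan + w[1] * delta_tardiness)


# Five policies, each biased toward a different trade-off point.
W = weight_vectors(5)
# Middle policy weights both objectives equally: w = (0.5, 0.5).
r = scalarized_reward(2.0, 4.0, W[2])
```

Training each policy against its own weight vector is one common way to approximate a Pareto front with scalarized rewards; the paper's MPCEM additionally lets the policies evolve collaboratively rather than independently.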