Keywords
Computer science, Mathematical optimization, Selection (genetic algorithm), Constraint (computer-aided design), Population, Evolutionary algorithm, Reinforcement learning, Process (computing), Optimization problem, Artificial intelligence, State (computer science), Representation (politics), Algorithm, Mathematics, Geometry, Demography, Sociology, Politics, Political science, Law, Operating system
Authors
Chao Wang, Zhihao Liu, Jianfeng Qiu, Lei Zhang
Identifier
DOI:10.1016/j.swevo.2024.101488
Abstract
Constrained multi-objective optimization problems involve the simultaneous optimization of multiple conflicting objectives subject to a number of constraints, which pose a great challenge for existing algorithms. When evolutionary algorithms are used to solve them, the constraint handling technique (CHT) plays a pivotal role in environmental selection. Several CHTs, such as penalty functions, the superiority of feasible solutions, and ϵ-constraint methods, have been developed. However, the existing methods still have some issues. On the one hand, different CHTs are typically better suited to specific problems, so selecting the most appropriate CHT for a given problem is crucial. On the other hand, the suitability of a CHT may vary across different stages of the optimization process. Regrettably, limited attention has been given to the adaptive selection of CHTs. To address this research gap, we develop an adaptive CHT selection method based on deep reinforcement learning, allowing CHTs to be selected that are tailored to different evolutionary states. In the proposed method, we adopt a deep Q-learning network to evaluate the impact of various CHTs and operators on the population state during evolution. Through dynamic evaluation, the network adaptively outputs the most appropriate CHT and operator portfolio based on the current state of the population. Specifically, we propose novel state representation and reward calculation methods to accurately capture the performance of diverse actions across varying evolutionary states. Furthermore, to enhance network training, we introduce a two-stage training method that facilitates the collection of diverse samples. Moreover, this adaptive selection method can be easily integrated into existing algorithms. The proposed algorithm is tested on 37 test problems and achieves the best results on 19 instances in terms of the inverted generational distance metric. Experimental results verify that the proposed method generalizes well to different types of problems.
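As a rough illustration of the adaptive selection loop described in the abstract, the sketch below pairs a small deep Q-network with an ϵ-greedy choice over (CHT, operator) combinations. The state features (feasibility ratio, mean constraint violation, objective spread, search progress), the action set, the reward handling, and all network sizes are illustrative assumptions for this sketch and are not taken from the paper.

```python
# A minimal sketch (not the authors' implementation) of DQN-based adaptive
# selection of a constraint-handling technique (CHT) and variation operator.
# State features, action set, reward, and network sizes are assumptions.
import random
from collections import deque

import numpy as np
import torch
import torch.nn as nn

# Hypothetical action space: each action pairs a CHT with an operator.
CHTS = ["penalty", "feasibility_rule", "epsilon_constraint"]
OPERATORS = ["sbx_pm", "de_rand_1"]
ACTIONS = [(c, o) for c in CHTS for o in OPERATORS]

class QNetwork(nn.Module):
    """Maps a population-state vector to Q-values, one per (CHT, operator) pair."""
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, x):
        return self.net(x)

def population_state(objs: np.ndarray, cons: np.ndarray, gen: int, max_gen: int) -> np.ndarray:
    """Illustrative state representation: feasibility ratio, mean constraint
    violation, objective spread, and a normalized generation counter."""
    violation = np.maximum(cons, 0.0).sum(axis=1)
    return np.array([
        float((violation == 0).mean()),   # feasible ratio
        float(violation.mean()),          # mean constraint violation
        float(objs.std()),                # rough diversity proxy
        gen / max_gen,                    # search progress
    ], dtype=np.float32)

def select_action(qnet: QNetwork, state: np.ndarray, epsilon: float) -> int:
    """Epsilon-greedy choice over the (CHT, operator) portfolio."""
    if random.random() < epsilon:
        return random.randrange(len(ACTIONS))
    with torch.no_grad():
        q = qnet(torch.from_numpy(state))
    return int(q.argmax().item())

def dqn_update(qnet, target_net, optimizer, replay: deque, batch_size=32, gamma=0.9):
    """One standard DQN step on transitions (state, action, reward, next_state)."""
    if len(replay) < batch_size:
        return
    batch = random.sample(replay, batch_size)
    s, a, r, s2 = map(np.array, zip(*batch))
    s = torch.from_numpy(s.astype(np.float32))
    s2 = torch.from_numpy(s2.astype(np.float32))
    a = torch.from_numpy(a.astype(np.int64)).unsqueeze(1)
    r = torch.from_numpy(r.astype(np.float32))
    q = qnet(s).gather(1, a).squeeze(1)
    with torch.no_grad():
        target = r + gamma * target_net(s2).max(dim=1).values
    loss = nn.functional.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In a full loop, the reward for a transition would be derived from the change in population quality indicators after applying the selected CHT and operator for one generation, and the resulting (state, action, reward, next state) tuples would be pushed into the replay buffer consumed by dqn_update; the paper's exact reward and two-stage training procedure are not reproduced here.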