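Reinforcement learning
Traveling salesman problem
Computer science
Generalizability theory
Generalization
Task (project management)
Artificial intelligence
Scale (ratio)
Node (physics)
Aggregate (composite)
Dynamics (music)
Mathematical optimization
Machine learning
Algorithm
Mathematics
Engineering
Mathematical analysis
Physics
Statistics
Composite material
Structural engineering
Quantum mechanics
Materials science
Systems engineering
Acoustics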
Authors
Yunqiu Xu, Meng Fang, Ling Chen, Yali Du, Gangyan Xu, Chengqi Zhang
Identifier
DOI:10.1016/j.aei.2023.102005
Abstract
In this work, we study generalization in reinforcement learning (RL) for the traveling salesman problem (TSP). While efforts have been made to design deep reinforcement learning-based solvers that achieve near-optimal results on small tasks, applying such solvers to larger-scale tasks while retaining performance remains an open problem. We learn the shared dynamics of TSP environments through multi-task learning, so that they can be generalized to new tasks. To estimate these dynamics accurately, we leverage node visitation information. Besides designing RL-based models that attentively aggregate the visitation information during decision making, we propose a scheduled data utilization strategy to stabilize learning across various problem sizes. Experimental results show that our model achieves improved generalizability to unseen larger TSPs in both zero-shot and few-shot settings.