定向运动
启发式
计算机科学
超参数
车辆路径问题
旅行商问题
数学优化
指针(用户界面)
航程(航空)
贪婪算法
基线(sea)
布线(电子设计自动化)
组合优化
人工智能
算法
数学
工程类
计算机网络
海洋学
地质学
航空航天工程
作者
Wouter Kool,Herke van Hoof,Max Welling
出处
期刊:Cornell University - arXiv
日期:2018-01-01
被引量:407
标识
DOI:10.48550/arxiv.1803.08475
摘要
The recently presented idea to learn heuristics for combinatorial optimization problems is promising as it can save costly development. However, to push this idea towards practical implementation, we need better models and better ways of training. We contribute in both directions: we propose a model based on attention layers with benefits over the Pointer Network and we show how to train this model using REINFORCE with a simple baseline based on a deterministic greedy rollout, which we find is more efficient than using a value function. We significantly improve over recent learned heuristics for the Travelling Salesman Problem (TSP), getting close to optimal results for problems up to 100 nodes. With the same hyperparameters, we learn strong heuristics for two variants of the Vehicle Routing Problem (VRP), the Orienteering Problem (OP) and (a stochastic variant of) the Prize Collecting TSP (PCTSP), outperforming a wide range of baselines and getting results close to highly optimized and specialized algorithms.
科研通智能强力驱动
Strongly Powered by AbleSci AI