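Bin packing problem
Reinforcement learning
Computer science
Heuristic
Bin
Artificial intelligence
State (computer science)
Mathematical optimization
Machine learning
Work (physics)
Order (exchange)
Algorithm
Engineering
Mathematics
Mechanical engineering
Finance
Economics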
Authors
Francesca Guerriero,Francesco Paolo Saccomanno
Identifier
DOI:10.1109/idaacs58523.2023.10348703
Abstract
Among machine learning paradigms, reinforcement learning trains an agent to operate in a dynamic environment so as to maximize its overall reward. By choosing the reward appropriately, it is possible to find optimal solutions for many problems. This work uses a new reward concept to train an agent to imitate a reference heuristic. In particular, the reward is proportional to the agent's ability to make the same choices as a particular heuristic when that heuristic is applied to a given problem state. The proposed strategy is used to address the bin packing problem. The collected computational results show the validity of the proposed approach and the ability of the agent to outperform the reference algorithm.
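
The abstract does not say which heuristic serves as the reference or how the reward is computed, so the following is only a minimal sketch of the general idea. It assumes first-fit as the reference heuristic, integer item sizes, and a per-step reward of `scale` whenever the agent's choice matches the heuristic's choice in the same state. The names `first_fit`, `imitation_reward`, and `episode_reward` are hypothetical and are not taken from the paper.

```python
from typing import List, Optional


def first_fit(item: int, bins: List[int], capacity: int) -> Optional[int]:
    """Assumed reference heuristic: return the index of the first open bin
    that can hold the item, or None to signal "open a new bin"."""
    for i, load in enumerate(bins):
        if load + item <= capacity:
            return i
    return None


def imitation_reward(agent_choice: Optional[int], item: int, bins: List[int],
                     capacity: int, scale: float = 1.0) -> float:
    """Reward the agent when its choice equals the reference heuristic's
    choice in the same state (assumed reward form, not the paper's)."""
    reference_choice = first_fit(item, bins, capacity)
    return scale if agent_choice == reference_choice else 0.0


def episode_reward(items: List[int], agent_choices: List[Optional[int]],
                   capacity: int) -> float:
    """Accumulate the imitation reward over one episode, updating the bin
    loads with the agent's own choices after each step."""
    bins: List[int] = []
    total = 0.0
    for item, choice in zip(items, agent_choices):
        total += imitation_reward(choice, item, bins, capacity)
        if choice is None:
            bins.append(item)  # the agent opens a new bin
        elif 0 <= choice < len(bins) and bins[choice] + item <= capacity:
            bins[choice] += item  # the agent places the item in an open bin
        else:
            raise ValueError(f"infeasible choice {choice} for item {item}")
    return total


if __name__ == "__main__":
    # The agent matches first-fit on the first three items and deviates on the last.
    print(episode_reward([4, 3, 5, 2], [None, 0, None, None], capacity=7))
```

In this sketch the cumulative reward grows with the number of steps on which the agent agrees with the reference heuristic, which is one simple way to read "proportional to the agent's ability to make the same choices". Because the state is updated with the agent's choices rather than the heuristic's, the heuristic is always queried on the states the agent actually reaches.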