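Reinforcement learning
Computer science
Leverage (statistics)
Artificial intelligence
Encoder
Constraint (computer-aided design)
Representation (politics)
Convolutional neural network
Mathematics
Geometry
Politics
Political science
Law
Operating system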
Authors
Yuan Jiang, Zhiguang Cao, Jie Zhang
Source
Journal: IEEE Transactions on Cybernetics
[Institute of Electrical and Electronics Engineers]
Date: 2021-11-08
Volume (issue), pages: 53 (5): 2864-2875
Citations: 28
Identifier
DOI: 10.1109/tcyb.2021.3121542
Abstract
Recently, there has been growing attention to applying deep reinforcement learning (DRL) to the 3-D bin packing problem (3-D BPP). However, because the encoder is relatively less informative yet computationally heavy, and the action space inherent to the 3-D BPP is considerably large, existing DRL methods can handle only up to 50 boxes. In this article, we propose to alleviate this issue via a DRL agent that sequentially addresses three subtasks: sequence, orientation, and position. Specifically, we exploit a multimodal encoder, in which a sparse attention subencoder embeds the box state to reduce computation while learning the packing policy, and a convolutional neural network subencoder embeds the view state to produce an auxiliary spatial representation. We also leverage action representation learning in the decoder to cope with the large action space of the position subtask. In addition, we integrate the proposed DRL agent into constraint programming (CP) to further improve solution quality iteratively by exploiting the powerful search framework of CP. The experiments show that both the DRL-only and hybrid methods enable the agent to solve large-scale instances of 120 boxes or more. Moreover, both deliver superior performance against the baselines on instances of various scales.
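The three-subtask decomposition described in the abstract (sequence, then orientation, then position) can be illustrated with a minimal sketch. This is not the authors' method: each learned decision is replaced here by a simple greedy rule, and the function names (`choose_box`, `choose_orientation`, `choose_position`, `pack`), the open-top bin with a fixed footprint, the stack-height result, and the top-down height map used as spatial state are all assumptions made for illustration.

```python
from itertools import permutations


def orientations(box):
    """All distinct axis-aligned rotations of an (l, w, h) box, at most 6."""
    return sorted(set(permutations(box)))


def choose_box(remaining):
    """Subtask 1 (sequence). Greedy stand-in for a learned policy: largest volume first."""
    return max(remaining, key=lambda b: b[0] * b[1] * b[2])


def choose_orientation(box, bin_l, bin_w):
    """Subtask 2 (orientation). Greedy stand-in: flattest rotation that fits the footprint."""
    fitting = [(l, w, h) for (l, w, h) in orientations(box) if l <= bin_l and w <= bin_w]
    if not fitting:
        return None
    return min(fitting, key=lambda o: (o[2], o[0], o[1]))


def choose_position(height_map, l, w):
    """Subtask 3 (position). Greedy stand-in: footprint with the lowest resting height.

    Returns (base_z, x, y); the box rests on the tallest cell under its footprint.
    """
    best = None
    for x in range(len(height_map) - l + 1):
        for y in range(len(height_map[0]) - w + 1):
            base = max(height_map[i][j] for i in range(x, x + l) for j in range(y, y + w))
            if best is None or (base, x, y) < best:
                best = (base, x, y)
    return best


def pack(boxes, bin_l, bin_w):
    """Pack boxes one at a time; return (placements, final stack height)."""
    height_map = [[0] * bin_w for _ in range(bin_l)]  # top-down view of the bin
    remaining = list(boxes)
    placements = []
    while remaining:
        box = choose_box(remaining)
        remaining.remove(box)
        oriented = choose_orientation(box, bin_l, bin_w)
        if oriented is None:
            raise ValueError(f"box {box} does not fit the bin footprint")
        l, w, h = oriented
        base, x, y = choose_position(height_map, l, w)
        for i in range(x, x + l):
            for j in range(y, y + w):
                height_map[i][j] = base + h  # update the view after placement
        placements.append(((x, y, base), (l, w, h)))
    return placements, max(max(row) for row in height_map)
```

Even in this toy version, the position step enumerates every footprint location on the grid, which hints at why the abstract singles out the position subtask as having the largest action space.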