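Reinforcement learning
Reinforcement
Food delivery
Order (exchange)
Computer science
Artificial intelligence
Business
Psychology
Marketing
Social psychology
Finance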
Authors
Xing Wang, Ling Wang, Chenxin Dong, Hao Ren, Ke Xing
Source
Journal: Tsinghua Science & Technology
[Tsinghua University Press]
Date: 2023-09-21
Volume (Issue): 29 (2): 356-367
Citations: 8
Identifier
DOI:10.26599/tst.2023.9010041
Abstract
On-demand food delivery (OFD) is becoming increasingly popular. As a core order assignment mechanism in OFD, order recommendation directly influences the platform's delivery efficiency and the riders' delivery experience. This paper addresses the dynamic nature of the order recommendation problem and proposes a reinforcement learning method to solve it. An actor-critic network based on long short-term memory (LSTM) units is designed to handle order-grabbing conflicts between riders. In addition, three rider sequencing rules are proposed to match the time steps of the LSTM unit to different riders. To evaluate the proposed method, extensive experiments are conducted on real data from the Meituan delivery platform. The results demonstrate that the proposed reinforcement-learning-based order recommendation method significantly increases the number of grabbed orders and reduces the number of order-grabbing conflicts, improving delivery efficiency and experience for both the platform and riders.