Markov decision process
Computer science
Heuristic
Benchmark (surveying)
Operations research
Reinforcement learning
Service (business)
Mathematical optimization
Time horizon
State space
Markov process
Artificial intelligence
Business
Engineering
Mathematics
Statistics
Geodesy
Marketing
Geography
Authors
Yutong Gao, Shu Zhang, Zhiwei Zhang, Quanwu Zhao
Identifiers
DOI: 10.1016/j.cor.2024.106550
Abstract
We introduce a stochastic share-a-ride problem in which a fleet of electric vehicles (EVs) in a ride-hailing system is dynamically dispatched to serve passenger and parcel orders in a shared manner. We assume uncertain demand for both passenger and parcel orders and give passenger orders priority over parcel orders. Passengers must be transported directly from their origins to their destinations, while parcels can share a vehicle with other orders. The operator of the ride-hailing platform needs to decide whether to accept a newly arrived service request, how to assign orders to vehicles, and how to route and charge the EVs. To develop dynamic policies for the problem, we formulate it as a Markov decision process (MDP) and propose a reinforcement learning (RL) approach to solve it. We develop action-space restriction and state-space aggregation schemes to facilitate the implementation of the RL algorithm. We also present two rolling-horizon heuristic methods that develop dynamic policies for our problem. We conduct computational experiments based on real-world taxi data from New York City. The computational results show that our RL policies outperform the three benchmark policies, serving more orders and collecting more rewards. Our RL policies also make high-quality decisions more efficiently than the rolling-horizon policies.
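The abstract names three ingredients that a short sketch can make concrete: an MDP over aggregated states, a restricted action set, and a learned dispatch policy. Below is a minimal toy sketch of that pattern, assuming tabular Q-learning (the paper does not specify its RL algorithm); the functions `aggregate`, `feasible_actions`, and `toy_step`, along with all reward values, transition probabilities, and bucket boundaries, are hypothetical placeholders, not the authors' implementation.

```python
# Illustrative sketch only: a tabular Q-learning loop over an aggregated
# state space with action-space restriction. The environment dynamics,
# reward values, and bucket boundaries are assumptions for this toy,
# not details taken from the paper.
import random
from collections import defaultdict

def aggregate(idle_evs, battery_pct, pending_orders, hour):
    """State-space aggregation: map a raw state onto coarse buckets."""
    return (min(idle_evs, 3),        # 0, 1, 2, or 3+ idle vehicles
            battery_pct // 25,       # battery level in 25% bands (0-4)
            min(pending_orders, 5),  # pending orders capped at 5+
            hour // 6)               # four time-of-day periods

def feasible_actions(state):
    """Action-space restriction: prune actions invalid in this state."""
    idle, batt, pending, _ = state
    acts = ["reject"]
    if idle > 0 and batt > 0 and pending > 0:
        acts += ["accept_passenger", "accept_parcel"]
    if batt < 4:
        acts.append("recharge")
    return acts

def toy_step(state, action):
    """Stand-in transition/reward model; the real one would be a simulator
    driven by historical demand data."""
    idle, batt, pending, period = state
    reward = {"accept_passenger": 10.0, "accept_parcel": 4.0,
              "reject": -1.0, "recharge": -0.5}[action]
    if action in ("accept_passenger", "accept_parcel"):
        idle, batt, pending = idle - 1, max(batt - 1, 0), pending - 1
    elif action == "recharge":
        batt = min(batt + 1, 4)
    pending = min(pending + random.randint(0, 2), 5)  # stochastic new demand
    if random.random() < 0.25:
        idle = min(idle + 1, 3)                       # a vehicle becomes idle
    if random.random() < 0.10:
        period = (period + 1) % 4                     # time of day advances
    return (idle, batt, pending, period), reward

def q_learning(episodes=2000, horizon=50, alpha=0.1, gamma=0.95, eps=0.1):
    Q = defaultdict(float)                            # (state, action) -> value
    for _ in range(episodes):
        state = aggregate(idle_evs=2, battery_pct=100, pending_orders=1, hour=8)
        for _ in range(horizon):
            acts = feasible_actions(state)
            if random.random() < eps:                 # epsilon-greedy exploration
                action = random.choice(acts)
            else:
                action = max(acts, key=lambda a: Q[(state, a)])
            nxt, r = toy_step(state, action)
            target = r + gamma * max(Q[(nxt, a)] for a in feasible_actions(nxt))
            Q[(state, action)] += alpha * (target - Q[(state, action)])
            state = nxt
    return Q

if __name__ == "__main__":
    Q = q_learning()
    s = aggregate(idle_evs=2, battery_pct=100, pending_orders=3, hour=12)
    print({a: round(Q[(s, a)], 2) for a in feasible_actions(s)})
```

The design point the two schemes share: aggregation keeps the value table small at the cost of state detail, and the restriction step shrinks the set of actions the learner must evaluate at each decision epoch, which is what makes a tabular approach tractable at all on a problem of this kind.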