Concepts: Markov decision process; partially observable Markov decision process; reinforcement learning; scheduling (production processes); operations research; Markov process; Markov chain; Markov model; electric vehicles; demand response; mathematical optimization; artificial intelligence; machine learning; power (physics); electrical engineering
Authors
Yanchang Liang, Zhaohao Ding, Tao Ding, Wei-Jen Lee
Source
Journal: IEEE Transactions on Smart Grid (Institute of Electrical and Electronics Engineers)
Date: 2020-09-21
Volume/Issue: 12 (2): 1380-1393
Citations: 111
Identifier
DOI: 10.1109/TSG.2020.3025082
Abstract
With the emerging concept of the sharing economy, shared electric vehicles (EVs) are playing an increasingly important role in future mobility-on-demand traffic systems. This article considers joint charging scheduling, order dispatching, and vehicle rebalancing for a large-scale shared-EV fleet operator. To maximize the operator's welfare, we model the joint decision making as a partially observable Markov decision process (POMDP) and apply deep reinforcement learning (DRL) combined with binary linear programming (BLP) to develop a near-optimal solution. A neural network is used to evaluate the state value of EVs at different times, locations, and states of charge. Based on the state values, dynamic electricity prices, and order information, the online scheduling is modeled as a BLP problem whose decision variables represent whether an EV will 1) take an order, 2) rebalance to a position, or 3) charge. We also propose a constrained rebalancing method to improve the exploration efficiency of training. Moreover, we provide a tabular method with proven convergence as a fallback option to demonstrate the near-optimal characteristics of the proposed approach. Simulation experiments with real-world data from Haikou City verify the effectiveness of the proposed method.
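At each decision epoch, the online scheduling described in the abstract reduces to a binary assignment problem: every EV takes exactly one action (serve an order, rebalance, or charge), each order is served by at most one EV, and the operator maximizes total estimated value. The following is a minimal brute-force sketch of that structure, not the paper's solver; the EV names, action set, and numeric values are made-up stand-ins for the neural network's state-value estimates:

```python
from itertools import product

# Hypothetical per-(EV, action) values standing in for learned state values
# plus electricity-price and order-fare terms. All numbers are illustrative.
ORDERS = {"order1", "order2"}
values = {
    "EV1": {"order1": 5.0, "order2": 3.0, "charge": 1.0, "rebalance": 0.5},
    "EV2": {"order1": 4.0, "order2": 3.5, "charge": 2.0, "rebalance": 0.5},
}

def dispatch(values, orders):
    """Enumerate every joint assignment and keep the best feasible one.

    Feasibility mirrors the BLP constraints: one action per EV (implicit in
    the enumeration) and each order assigned to at most one EV.
    """
    evs = sorted(values)
    best_value, best_plan = float("-inf"), None
    for actions in product(*(values[ev] for ev in evs)):
        taken = [a for a in actions if a in orders]
        if len(taken) != len(set(taken)):  # same order assigned twice
            continue
        total = sum(values[ev][a] for ev, a in zip(evs, actions))
        if total > best_value:
            best_value, best_plan = total, dict(zip(evs, actions))
    return best_value, best_plan

best_value, best_plan = dispatch(values, ORDERS)
print(best_value, best_plan)  # 8.5 {'EV1': 'order1', 'EV2': 'order2'}
```

Exhaustive enumeration is exponential in fleet size and is only workable for this toy instance; at the scale the paper targets, the same objective and constraints would be handed to a BLP solver instead.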