A Reinforcement Learning-Based Vehicle Platoon Control Strategy for Reducing Energy Consumption in Traffic Oscillations

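Keywords: Reinforcement learning; Computer science; Convergence (economics); State space; Multi-agent system; Distributed computing; Control (management); Artificial intelligence; Mathematics; Economic growth; Statistics; Economics
Authors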
Meng Li,Zehong Cao,Zhibin Li
Source
Journal: IEEE Transactions on Neural Networks and Learning Systems [Institute of Electrical and Electronics Engineers]
Volume (issue): 32 (12): 5309-5322 Citations: 33
Identifier
DOI:10.1109/tnnls.2021.3071959
Abstract

The vehicle platoon will be the most dominant driving mode on future roads. To the best of our knowledge, few reinforcement learning (RL) algorithms have been applied to vehicle platoon control, which has large-scale action and state spaces. Some RL-based methods have been applied to single-agent problems. Multiagent problems call for multiagent RL algorithms, because the parameter space grows exponentially with the number of agents involved. Previous multiagent RL algorithms may pass redundant information to agents, much of it useless or unrelated, which can make it difficult for training to converge and for agents to extract patterns from the shared information. In addition, random actions often lead to crashes, especially at the beginning of training. In this study, a communication proximal policy optimization (CommPPO) algorithm is proposed to tackle these issues. Specifically, the CommPPO model adopts a parameter-sharing structure that allows the number of agents to vary dynamically, so it can handle various platoon dynamics, including splitting and merging. The communication protocol of CommPPO consists of two parts. In the state part, the widely used predecessor-leader-follower topology in the platoon is adopted to transmit global and local state information to agents. In the reward part, a new reward communication channel is proposed to solve the spurious reward and “lazy agent” problems found in some existing multiagent RL algorithms. Moreover, a curriculum learning approach is adopted to reduce crashes and speed up training. To validate the proposed strategy for platoon control, two existing multiagent RL algorithms and a traditional platoon control strategy were applied in the same scenarios for comparison. Results showed that the CommPPO algorithm gained more rewards and achieved the largest fuel consumption reduction (11.6%).
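
As a rough illustration of the two structural ideas in the abstract (a shared policy applied to every follower, and a predecessor-leader-follower state layout), the following minimal Python sketch may help. It is not the authors' implementation: the vehicle representation, the names `plf_observation`, `act_all`, and `toy_policy`, the hand-written linear rule standing in for a trained PPO policy, and the 20 m target gap are all assumptions made for illustration only.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical per-vehicle state: (position in m, speed in m/s).
Vehicle = Tuple[float, float]
# Hypothetical observation: (gap to predecessor, speed difference to predecessor,
#                            gap to leader, speed difference to leader).
Observation = Tuple[float, float, float, float]


def plf_observation(platoon: Sequence[Vehicle], i: int) -> Observation:
    """Build the observation of follower i (i >= 1).

    Local part: gap and relative speed to the vehicle directly ahead.
    Global part: gap and relative speed to the platoon leader (index 0).
    """
    pos, speed = platoon[i]
    pred_pos, pred_speed = platoon[i - 1]
    lead_pos, lead_speed = platoon[0]
    return (pred_pos - pos, pred_speed - speed,
            lead_pos - pos, lead_speed - speed)


def act_all(platoon: Sequence[Vehicle],
            policy: Callable[[Observation], float]) -> List[float]:
    """Apply ONE shared policy to every follower.

    Because the same function (same parameters) serves all agents,
    the platoon length can change between calls without any change
    to the policy itself.
    """
    return [policy(plf_observation(platoon, i)) for i in range(1, len(platoon))]


def toy_policy(obs: Observation) -> float:
    """Stand-in for a trained policy: a simple linear acceleration rule.

    The gains and the 20 m target gap are arbitrary illustrative values.
    """
    gap, rel_speed, _lead_gap, lead_rel_speed = obs
    return 0.1 * (gap - 20.0) + 0.5 * rel_speed + 0.2 * lead_rel_speed


if __name__ == "__main__":
    platoon = [(100.0, 20.0), (80.0, 20.0), (55.0, 18.0)]
    print(act_all(platoon, toy_policy))       # one action per follower
    merged = platoon + [(30.0, 19.0)]         # a vehicle joins at the tail
    print(act_all(merged, toy_policy))        # same policy, one more action
```

In this sketch, merging or splitting only changes the length of the list passed to `act_all`, which is the property that parameter sharing is meant to provide; the reward communication channel and the curriculum learning schedule are not modeled here because the abstract does not specify them.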