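Markov decision process
Computer science
Reinforcement learning
Bipartite graph
Artificial intelligence
Matching (statistics)
Operator (biology)
Profit (economics)
Markov process
Autonomous agent
Mathematical optimization
Operations research
Theoretical computer science
Mathematics
Graph
Biochemistry
Statistics
Chemistry
Repressor
Transcription factor
Microeconomics
Economics
Gene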
Authors
Tobias Enders, J. Harrison, Marco Pavone, Maximilian Schiffer
Source
Journal: Cornell University - arXiv
Date: 2022-01-01
Citations: 2
Identifier
DOI: 10.48550/arxiv.2212.07313
Abstract
We consider the sequential decision-making problem of making proactive request assignment and rejection decisions for a profit-maximizing operator of an autonomous mobility-on-demand system. We formalize this problem as a Markov decision process and propose a novel combination of multi-agent Soft Actor-Critic and weighted bipartite matching to obtain an anticipative control policy. Thereby, we factorize the operator's otherwise intractable action space, but still obtain a globally coordinated decision. Experiments based on real-world taxi data show that our method outperforms state-of-the-art benchmarks with respect to performance, stability, and computational tractability.
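The coordination step mentioned in the abstract can be illustrated with a small sketch. It assumes each vehicle agent supplies one score per open request and that the operator then picks the highest-weight bipartite matching, rejecting any request left unmatched. The function name `match_requests`, the score values, and the rule that only positive scores are worth serving are all hypothetical; the abstract does not say how the learned agents produce their weights. The exhaustive search is a stand-in for a proper assignment solver and is only practical for toy instances.

```python
from itertools import permutations

def match_requests(scores):
    """Return (assignment, rejected, total) for a max-weight bipartite matching.

    scores[v][r] is a hypothetical per-vehicle score for serving request r.
    Brute force over all matchings: illustrative only, not a scalable solver.
    """
    n_vehicles = len(scores)
    n_requests = len(scores[0]) if scores else 0
    # None slots let any vehicle stay idle instead of taking a request.
    slots = list(range(n_requests)) + [None] * n_vehicles
    best_total, best = 0.0, {}
    for perm in permutations(slots, n_vehicles):
        # Keep only pairs with positive score (assumption: others are not worth serving).
        pairs = {v: r for v, r in enumerate(perm)
                 if r is not None and scores[v][r] > 0}
        total = sum(scores[v][r] for v, r in pairs.items())
        if total > best_total:
            best_total, best = total, pairs
    # Requests that no vehicle was matched to are rejected.
    rejected = sorted(set(range(n_requests)) - set(best.values()))
    return best, rejected, best_total

# Two vehicles, three requests; both vehicles individually prefer request 0.
example_scores = [
    [4.0, 1.0, -2.0],
    [3.0, 2.5, -1.0],
]
assignment, rejected, total = match_requests(example_scores)
print(assignment, rejected, total)
```

In this toy instance both vehicles would pick request 0 if they acted independently. The matching instead sends vehicle 0 to request 0 and vehicle 1 to request 1, and rejects request 2, which is the kind of globally coordinated outcome the abstract refers to.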