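Negotiation
Computer science
Opponent
Reuse
Artificial intelligence
Coping (psychology)
Key (lock)
Business process reengineering
Machine learning
Computer security
Operations research
Operations management
Engineering
Waste management
Psychology
Law
Lean manufacturing
Psychiatry
Political science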
Authors
Leling Wu,Siqi Chen,Xiaoyang Gao,Zheng Yan,Jianye Hao
Identifier
DOI:10.1007/978-3-030-89370-5_2
Abstract
Learning in automated negotiation, while successful for many tasks in recent years, remains hard when an agent must cope with different types of opponents whose strategies are unknown. It is essential to learn about opponents from observations and then find the best response in order to reach efficient agreements. In this paper, we propose a novel negotiating agent framework named Deep BPR+ (DBPR+), which includes two key components: a learning module that learns a new coping policy when the agent encounters an opponent using a previously unseen strategy, and a policy reuse mechanism that efficiently detects the opponent's strategy and selects the optimal response policy from the policy library. The performance of the proposed DBPR+ agent is evaluated against winning agents of ANAC competitions under varied negotiation scenarios. The experimental results show that the DBPR+ agent outperforms existing state-of-the-art agents and can efficiently detect, and respond optimally to, unknown opponents.
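The policy reuse mechanism described in the abstract can be illustrated with a small tabular sketch in the style of Bayesian policy reuse. This is not the authors' implementation: DBPR+ uses deep models, whereas the sketch below assumes a hand-written performance table, Gaussian utility likelihoods, and a simple likelihood threshold for flagging an unseen opponent. The names `PERF`, `select_policy`, `update_belief`, and `looks_unseen`, and all numbers, are hypothetical.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))

# Hypothetical performance model (an assumption, not from the paper):
# PERF[opponent_type][policy] = (mean utility, std of utility).
PERF = {
    "conceder":  {"pi_tough": (0.85, 0.05), "pi_soft": (0.60, 0.05)},
    "hardliner": {"pi_tough": (0.30, 0.05), "pi_soft": (0.50, 0.05)},
}

def select_policy(belief, perf=PERF):
    """Pick the library policy with the highest expected utility under the belief."""
    policies = list(next(iter(perf.values())).keys())
    def expected_utility(policy):
        return sum(belief[opp] * perf[opp][policy][0] for opp in belief)
    return max(policies, key=expected_utility)

def update_belief(belief, policy, utility, perf=PERF):
    """Bayes update of the belief over opponent types after observing a utility."""
    posterior = {
        opp: belief[opp] * gaussian_pdf(utility, *perf[opp][policy])
        for opp in belief
    }
    total = sum(posterior.values())
    if total == 0.0:
        return dict(belief)  # observation uninformative at float precision
    return {opp: p / total for opp, p in posterior.items()}

def looks_unseen(policy, utility, perf=PERF, threshold=1e-3):
    """Assumed detection rule: no known opponent type explains the observation."""
    return all(gaussian_pdf(utility, *perf[opp][policy]) < threshold for opp in perf)

belief = {"conceder": 0.5, "hardliner": 0.5}
first = select_policy(belief)                  # policy chosen under a uniform prior
belief = update_belief(belief, first, 0.32)    # low utility observed with that policy
second = select_policy(belief)                 # policy chosen after the update
```

In this sketch, a `looks_unseen` result of `True` would be the point at which a learning module trains a new response policy and adds it, with its performance entry, to the library.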