
Skill-Critic: Refining Learned Skills for Hierarchical Reinforcement Learning

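Refining (metallurgy); Steel reinforcement; Reinforcement learning; Computer science; Artificial intelligence; Psychology; Materials science; Social psychology; Metallurgy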

Authors

Ce Hao, Catherine Weaver, Chen Tang, Kiyosumi Kawamoto, Masayoshi Tomizuka, Wei Zhan

Source

Journal: IEEE Robotics and Automation Letters; Date: 2024-02-21; Volume (Issue): 9 (4): 3625-3632; Citations: 1

Identifier

DOI: 10.1109/lra.2024.3368231

Abstract

Hierarchical reinforcement learning (RL) can accelerate long-horizon decision-making by temporally abstracting a policy into multiple levels. Promising results in sparse reward environments have been seen with skills, i.e., sequences of primitive actions. Typically, a skill latent space and policy are discovered from offline data. However, the resulting low-level policy can be unreliable due to low-coverage demonstrations or distribution shifts. As a solution, we propose the Skill-Critic algorithm to fine-tune the low-level policy in conjunction with high-level skill selection. Our Skill-Critic algorithm optimizes both the low-level and high-level policies; these policies are initialized and regularized by the latent space learned from offline demonstrations to guide the parallel policy optimization. We validate Skill-Critic in multiple sparse-reward RL environments, including a new sparse-reward autonomous racing task in Gran Turismo Sport. The experiments show that Skill-Critic's low-level policy fine-tuning and demonstration-guided regularization are essential for good performance.
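The temporal abstraction mentioned in the abstract (a high-level policy picks a skill, and a low-level policy turns that skill into a sequence of primitive actions) can be illustrated with a toy sketch. This is not the Skill-Critic algorithm and involves no learning; the 1-D environment, `GOAL`, `SKILL_HORIZON`, and both policy functions are hypothetical stand-ins chosen only to show the two-level control loop with a sparse reward.

```python
GOAL = 10          # hypothetical goal position in a toy 1-D task
SKILL_HORIZON = 5  # hypothetical number of primitive steps per skill


def high_level_policy(state):
    """Select a skill latent z. A hand-written rule stands in for a trained policy."""
    return 1 if state < GOAL else -1


def low_level_policy(state, z):
    """Map (state, skill) to a primitive action. Here: a unit step in direction z."""
    return z


def rollout(max_steps=50):
    """Run one episode; reward is sparse (1.0 only when the goal is reached)."""
    state, step, skills_used = 0, 0, 0
    while step < max_steps:
        z = high_level_policy(state)      # high level acts once per skill
        skills_used += 1
        for _ in range(SKILL_HORIZON):    # low level acts at every primitive step
            state += low_level_policy(state, z)
            step += 1
            if state == GOAL:
                return state, 1.0, skills_used, step
            if step >= max_steps:
                break
    return state, 0.0, skills_used, step


if __name__ == "__main__":
    print(rollout())
```

In this sketch the high-level policy makes one decision for every `SKILL_HORIZON` primitive steps, which is the sense in which skills shorten the effective decision horizon. In the method the abstract describes, both levels would be learned and regularized by a prior obtained from offline demonstrations, which this toy example does not attempt to reproduce.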




