Fast and Slow Learning of Recurrent Independent Mechanisms

Computer science  Psychology  Artificial intelligence
Authors
Kanika Madan, Nan Rosemary Ke, Anirudh Goyal, Bernhard Schölkopf, Yoshua Bengio
Source
Journal: Cornell University - arXiv  Citations: 10
Identifier
DOI: 10.48550/arxiv.2105.08710
Abstract

Decomposing knowledge into interchangeable pieces promises a generalization advantage when there are changes in distribution. A learning agent interacting with its environment is likely to be faced with situations requiring novel combinations of existing pieces of knowledge. We hypothesize that such a decomposition of knowledge is particularly relevant for being able to generalize in a systematic manner to out-of-distribution changes. To study these ideas, we propose a particular training framework in which we assume that the pieces of knowledge an agent needs and its reward function are stationary and can be re-used across tasks. An attention mechanism dynamically selects which modules can be adapted to the current task, and the parameters of the selected modules are allowed to change quickly as the learner is confronted with variations in what it experiences, while the parameters of the attention mechanisms act as stable, slowly changing, meta-parameters. We focus on pieces of knowledge captured by an ensemble of modules sparsely communicating with each other via a bottleneck of attention. We find that meta-learning the modular aspects of the proposed system greatly helps in achieving faster adaptation in a reinforcement learning setup involving navigation in a partially observed grid world with image-level input. We also find that reversing the role of parameters and meta-parameters does not work nearly as well, suggesting a particular role for fast adaptation of the dynamically selected modules.
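
The fast/slow split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it replaces whatever meta-learning procedure the paper uses with two plain gradient steps of different sizes, and it assumes a simple top-k rule for the attention-based selection. The names `select_modules` and `two_timescale_update`, the learning rates, and the array shapes are all hypothetical and chosen only for illustration.

```python
import numpy as np


def select_modules(scores, k):
    """Return the indices of the k highest-scoring modules.

    Stands in for attention-based selection (assumption: plain top-k).
    """
    return np.argsort(scores)[::-1][:k]


def two_timescale_update(module_params, attn_params, module_grads, attn_grads,
                         active, lr_fast=0.1, lr_slow=0.001):
    """Take a large step on the selected modules and a small step on attention.

    module_params, module_grads: arrays of shape (num_modules, dim)
    attn_params, attn_grads:     arrays of the same shape as each other
    active:                      indices of the modules selected for this step
    """
    new_modules = module_params.copy()
    # Fast parameters: only the currently selected modules are adapted.
    new_modules[active] -= lr_fast * module_grads[active]
    # Slow (meta) parameters: attention weights move with a much smaller step.
    new_attn = attn_params - lr_slow * attn_grads
    return new_modules, new_attn


# Hypothetical usage with made-up scores and gradients.
scores = np.array([0.1, 0.9, 0.5, 0.3])
active = select_modules(scores, k=2)
modules, attn = two_timescale_update(
    np.zeros((4, 3)), np.zeros(2), np.ones((4, 3)), np.ones(2), active
)
```

In this sketch the unselected modules are left untouched, which mirrors the idea that only the dynamically selected modules adapt quickly to the current task while the attention parameters drift slowly.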