Multimodal Co-learning: Challenges, applications with datasets, recent advances and future directions

Authors
Anil Rahate, Rahee Walambe, Sheela Ramanna, Ketan Kotecha
Source
Journal: Information Fusion [Elsevier]
Volume: 81, pages 203-239. Cited by: 62
Identifier
DOI: 10.1016/j.inffus.2021.12.003
Abstract

Multimodal deep learning systems that employ multiple modalities such as text, image, audio, and video outperform systems that use a single modality (i.e., unimodal systems). Multimodal machine learning involves multiple aspects: representation, translation, alignment, fusion, and co-learning. Current multimodal machine learning typically assumes that all modalities are present, aligned, and noiseless during training and testing. In real-world tasks, however, one or more modalities may be missing, noisy, lacking annotated data, or carrying unreliable labels, and may be scarce at training time, at testing time, or both. This challenge is addressed by a learning paradigm called multimodal co-learning, in which the modeling of a resource-poor modality is aided by knowledge transferred from a resource-rich modality, including its representations and predictive models. Because co-learning is an emerging area, there are no dedicated reviews that explicitly cover all the challenges it addresses. To that end, we provide a comprehensive survey of multimodal co-learning, an area that has not yet been explored in its entirety. We review implementations that overcome one or more co-learning challenges without explicitly framing them as such, and we present a comprehensive taxonomy of multimodal co-learning based on the challenges addressed and the associated implementations. We review the relevant techniques, including the most recent ones, along with representative applications and datasets. Additionally, we review techniques that appear similar to multimodal co-learning but are used primarily in unimodal or multi-view learning, and we document the distinctions between them. Finally, we discuss open challenges, perspectives, and directions for future work that we hope will benefit the entire research community working in this exciting domain.
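
The abstract does not commit to any single implementation. As one concrete illustration of the knowledge-transfer idea it describes, the minimal sketch below shows cross-modal knowledge distillation, a common co-learning pattern: a teacher trained on a resource-rich modality (e.g., images) provides soft targets that guide a student for a resource-poor modality (e.g., audio). All names, dimensions, and hyperparameters here (Encoder, emb_dim, alpha, T) are hypothetical assumptions for illustration, not details taken from the paper.

```python
# Illustrative sketch (not the survey's method): cross-modal knowledge
# distillation, where a frozen teacher on a resource-rich modality guides
# a student on a resource-poor modality via soft targets.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps one modality's features into a shared embedding space."""
    def __init__(self, in_dim, emb_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, emb_dim)
        )
    def forward(self, x):
        return self.net(x)

teacher = Encoder(in_dim=512)    # resource-rich modality (assume pre-trained)
teacher.requires_grad_(False)    # frozen: provides soft targets only
student = Encoder(in_dim=64)     # resource-poor modality (trained here)
head = nn.Linear(128, 10)        # shared classifier over the embedding

opt = torch.optim.Adam(
    list(student.parameters()) + list(head.parameters()), lr=1e-3
)

def co_learning_step(rich_x, poor_x, labels, alpha=0.5, T=2.0):
    """One step: task loss on the poor modality plus a distillation loss
    pulling the student's predictions toward the teacher's."""
    with torch.no_grad():
        t_logits = head(teacher(rich_x))
    s_logits = head(student(poor_x))
    task_loss = F.cross_entropy(s_logits, labels)
    distill_loss = F.kl_div(            # soften both sides with temperature T
        F.log_softmax(s_logits / T, dim=-1),
        F.softmax(t_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    loss = (1 - alpha) * task_loss + alpha * distill_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy usage with random tensors standing in for paired multimodal data.
rich_x = torch.randn(8, 512)
poor_x = torch.randn(8, 64)
labels = torch.randint(0, 10, (8,))
print(co_learning_step(rich_x, poor_x, labels))
```

At test time the student can be used alone, which is the point of co-learning in this pattern: the resource-rich modality is needed only during training, so a missing or scarce modality at inference no longer breaks the system.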