Deep Multimodal Representation Learning: A Survey

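Computer science; Mode; Multimodal learning; Feature learning; Deep learning; Artificial intelligence; Key (lock); Multimodality; Representation (politics); Abstraction; Perspective (graphical); Generative grammar; Machine learning; Data science; Human-computer interaction; World Wide Web; Epistemology; Philosophy; Sociology; Law; Politics; Computer security; Social science; Political science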
Authors
Wenzhong Guo, Jianwen Wang, Shiping Wang
Source
Journal: IEEE Access [Institute of Electrical and Electronics Engineers]
Volume/issue: 7: 63373-63394; Citations: 300
Identifier
DOI:10.1109/access.2019.2916887
Abstract

Multimodal representation learning, which aims to narrow the heterogeneity gap among different modalities, plays an indispensable role in the utilization of ubiquitous multimodal data. Owing to its powerful representation ability with multiple levels of abstraction, deep learning-based multimodal representation learning has attracted much attention in recent years. In this paper, we provide a comprehensive survey of deep multimodal representation learning, a topic that has not previously been the sole focus of a survey. To facilitate the discussion of how the heterogeneity gap is narrowed, we categorize deep multimodal representation learning methods into three frameworks according to the underlying structures in which different modalities are integrated: joint representation, coordinated representation, and encoder-decoder. Additionally, we review typical models in this area, ranging from conventional models to newly developed technologies. This paper highlights the key issues of newly developed technologies, such as the encoder-decoder model, generative adversarial networks, and the attention mechanism, from a multimodal representation learning perspective; to the best of our knowledge, these have not been reviewed previously, even though they have become major focuses of contemporary research. For each framework or model, we discuss its basic structure, learning objective, application scenarios, key issues, advantages, and disadvantages, so that both novice and experienced researchers can benefit from this survey. Finally, we suggest some important directions for future work.