Adapt and explore: Multimodal mixup for representation learning

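Computer science; Modalities; Multimodal learning; Artificial intelligence; Robustness (evolution); Machine learning; Feature learning; Modality (human-computer interaction); Representation (politics); Bridging (networking); Exploit; Reinforcement learning; Encoder; Computer network; Social science; Biochemistry; Chemistry; Computer security; Sociology; Politics; Political science; Law; Gene; Operating system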
Authors
Ronghao Lin,Haifeng Hu
Source
Journal: Information Fusion [Elsevier]
Volume/issue: 105: 102216-102216 Citations: 1
Identifier
DOI:10.1016/j.inffus.2023.102216
Abstract

Research on general multimodal systems has gained significant attention due to the proliferation of multimodal data in the real world. Despite the remarkable performance achieved by existing multimodal representation learning schemes, missing modalities remain a persistent issue that limits the overall applicability of multimodal systems. To address this issue, we propose a novel approach named M3ixup (Multi-Modal Mixup), which leverages the mixup strategy to improve unimodal and multimodal representation learning while simultaneously increasing robustness against missing modalities. First, we adopt a productive multimodal learning scheme to model representations with modality-specific and joint-modality encoders. This general scheme makes the proposed approach transferable to various multimodal learning scenarios, including supervised, unsupervised, and reinforcement learning. Then, unimodal input mixup and manifold mixup are used to enhance the modality-specific encoders so that they capture intra-modal dynamics. Next, we present multimodal mixup, which mixes different modalities and generates mixed multimodal representations in an adapting step and an exploring step. The former step aims to bridge the large information gaps between unimodal and multimodal representations in the joint space during alignment, while the latter step further captures the inter-modal dynamics and exploits the non-linear relationships among different modalities. After that, the mixed views are aligned with the original multimodal representations by contrastive learning. Additionally, we extend the mixup strategy to the loss function of multimodal contrastive learning in two steps to improve the alignment between mixed and original views. Extensive experiments on public datasets in various multimodal learning scenarios demonstrate the superiority of the proposed M3ixup. The code is available at https://github.com/RH-Lin/m3ixup.
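
The abstract names two building blocks without giving formulas: mixing representations and aligning mixed views with the original ones by contrastive learning. As a rough illustration only, the sketch below uses the standard mixup rule (a convex combination with a coefficient drawn from Beta(alpha, alpha)) and a generic InfoNCE loss. It is not the authors' M3ixup implementation; the function names, the toy text/audio vectors, the averaging used as a stand-in for a joint-modality encoder, and the values of `alpha` and `temperature` are all assumptions made for this example.

```python
import math
import random


def sample_lambda(alpha, rng):
    """Draw a mixing coefficient from Beta(alpha, alpha), as in standard mixup."""
    return rng.betavariate(alpha, alpha)


def mixup(u, v, lam):
    """Convex combination lam * u + (1 - lam) * v of two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return [lam * a + (1.0 - lam) * b for a, b in zip(u, v)]


def cosine(u, v):
    """Cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(mixed, originals, temperature=0.1):
    """Mean InfoNCE loss; mixed[i] and originals[i] form the positive pair."""
    total = 0.0
    for i, m in enumerate(mixed):
        logits = [cosine(m, o) / temperature for o in originals]
        peak = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(mixed)


rng = random.Random(0)

# Hypothetical toy embeddings: two samples, each with a text and an audio vector.
text = [[1.0, 0.0, 0.5], [0.0, 1.0, 0.2]]
audio = [[0.9, 0.1, 0.4], [0.1, 0.8, 0.3]]

# Assumption: a plain average stands in for the joint-modality representation.
fused = [mixup(t, a, 0.5) for t, a in zip(text, audio)]

# Mixed multimodal views: one shared lambda mixes the two modalities per sample.
lam = sample_lambda(0.4, rng)
mixed_views = [mixup(t, a, lam) for t, a in zip(text, audio)]

# Contrastive alignment of each mixed view with its own fused representation.
loss = info_nce(mixed_views, fused)
print(f"lambda={lam:.3f}  loss={loss:.4f}")
```

In this toy setting each mixed view lies between the two modality vectors of its own sample, so it sits much closer to that sample's fused vector than to the other sample's, and the loss falls well below the chance level of log 2 for two candidates.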