Excavating multimodal correlation for representation learning

Authors
Sijie Mai, Sun Ya, Ying Zeng, Haifeng Hu
Source
Journal: Information Fusion [Elsevier]
Volume 91, pp. 542-555 · Cited by: 23
Identifier
DOI: 10.1016/j.inffus.2022.11.003
Abstract

Most previous methods for multimodal representation learning ignore the rich correlation information inherently stored in each sample, leading to a lack of robustness when trained on small datasets. Although a few contrastive learning frameworks leverage this information in a self-supervised manner, they generally encourage the intra-sample unimodal representations to be identical, neglecting the modality-specific information carried by individual modalities. In contrast, we propose a novel algorithm that learns the correlations between modalities to facilitate downstream multimodal tasks by leveraging prior information across samples, and we explore the feasibility of the proposed method on elaborately designed unsupervised and supervised auxiliary learning tasks. Specifically, we construct the positive and negative sets for correlation learning as unimodal embeddings from the same sample and from different samples, respectively. A weak predictor is applied to the concatenated unimodal embeddings to learn the correspondence relationship for each set. In this way, the model can correlate unimodal features and discover the information shared across modalities. In contrast to contrastive learning methods, the proposed framework is compatible with any number of modalities and retains modality-specific information, enabling the multimodal representation to capture richer information. Moreover, in the supervised version, one of the main novelties is that sample labels are further utilized to learn more discriminative features: the correlation scores assigned to negative sets vary with the label differences between the associated samples. Extensive experiments suggest that the proposed method reaches state-of-the-art performance on multimodal sentiment analysis, emotion recognition, and humor detection, and can improve the performance of various fusion approaches.