Discriminative model
Computer science
Sample (material)
Artificial intelligence
Modality (human–computer interaction)
Feature (linguistics)
Machine learning
Class (philosophy)
Exploit
Sampling (signal processing)
Pattern recognition (psychology)
Natural language processing
Linguistics
Computer vision
Chemistry
Chromatography
Philosophy
Computer security
Filter (signal processing)
Authors
Meiling Li, Yifan Wei, Yangfu Zhu, Siqi Wei, Bin Wu
Identifier
DOI:10.1016/j.ins.2024.121282
Abstract
Multimodal depression detection (MDD) has garnered significant interest in recent years. Current methods typically integrate multimodal information within samples to distinguish positive from negative samples, but they often neglect the relationships between samples. Samples within the same class share similarities, yet individual variations remain. By leveraging these inter-sample relationships, we can provide supervision signals for both inter- and intra-class samples, thereby enhancing the discriminative power of user representations. Motivated by this observation, we introduce IISFD, a novel approach that jointly exploits intra-sample contrastive learning and inter-sample contrastive learning with hard negative sampling, thereby considering information both within individual samples and across samples. Specifically, we decompose the multimodal input of each sample (audio, vision, and text) into modality-common features and modality-specific features. To obtain better decomposed feature representations, we combine intra-sample contrastive learning with inter-sample contrastive learning under hard negative sampling. In addition, fine-grained modality information is preserved through unimodal reconstruction. By passing the decomposed features through a carefully designed multimodal fusion module, we obtain more discriminative user representations. Experimental results on two publicly available datasets demonstrate the superiority of our model, highlighting its effectiveness in leveraging both intra- and inter-sample information for enhanced MDD.
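The abstract names two concrete mechanisms: splitting each modality into modality-common and modality-specific features, and an inter-sample contrastive loss that emphasizes hard negatives. The sketch below illustrates both ideas in PyTorch; it is not the authors' released implementation. The `ModalityDecomposer` module, the `supcon_hard_negatives` function, all layer sizes, and the `beta` hardness parameter are illustrative assumptions, and the exp-weighted negative reweighting follows the common scheme of Robinson et al. (2021) rather than anything specified in the paper.

```python
# Minimal sketch (not the authors' code) of two ideas from the abstract:
# (1) decomposing one modality's features into modality-common vs.
#     modality-specific subspaces, and
# (2) a supervised contrastive loss whose negatives are reweighted
#     toward hard (high-similarity) negatives.
# All names, dimensions, and the beta-weighting scheme are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModalityDecomposer(nn.Module):
    """Projects one modality's features into a shared ('common') and a
    private ('specific') subspace; layer sizes are hypothetical."""

    def __init__(self, in_dim: int, hid_dim: int = 128):
        super().__init__()
        self.common = nn.Linear(in_dim, hid_dim)    # cross-modal semantics
        self.specific = nn.Linear(in_dim, hid_dim)  # modality-private detail

    def forward(self, x: torch.Tensor):
        return self.common(x), self.specific(x)


def supcon_hard_negatives(z: torch.Tensor, labels: torch.Tensor,
                          tau: float = 0.1, beta: float = 1.0) -> torch.Tensor:
    """Supervised contrastive loss over a batch of embeddings `z`.

    Negatives are importance-weighted by exp(beta * similarity), so the
    denominator is dominated by the hardest negatives; beta = 0 recovers
    uniform negative sampling.
    """
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / tau                               # pairwise similarities
    n = z.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=z.device)
    same = labels.unsqueeze(0).eq(labels.unsqueeze(1))
    pos_mask = same & ~eye                              # same class, not self
    neg_mask = ~same                                    # different class

    exp_sim = torch.exp(sim)
    # Hardness weights: high-similarity negatives get more mass.
    w = torch.exp(beta * sim.detach()) * neg_mask
    w = w / w.sum(dim=1, keepdim=True).clamp_min(1e-8)
    neg_term = (w * exp_sim).sum(dim=1) * neg_mask.sum(dim=1)

    denom = (exp_sim * pos_mask).sum(dim=1) + neg_term
    log_prob = sim - torch.log(denom.clamp_min(1e-8)).unsqueeze(1)

    # Average the log-likelihood over each anchor's positive pairs,
    # skipping anchors that have no positive in the batch.
    pos_cnt = pos_mask.sum(dim=1)
    has_pos = pos_cnt > 0
    loss = -(log_prob * pos_mask).sum(dim=1)[has_pos] / pos_cnt[has_pos]
    return loss.mean()


if __name__ == "__main__":
    torch.manual_seed(0)
    feats = torch.randn(8, 64)                  # e.g. one modality's features
    labels = torch.tensor([0, 0, 1, 1, 0, 1, 0, 1])
    common, specific = ModalityDecomposer(64)(feats)
    print(supcon_hard_negatives(common, labels).item())
```

In this reading, the common features of different modalities (and, across samples, of same-class users) would be pulled together by such a loss, while the specific branch retains per-modality detail for the reconstruction objective the abstract mentions.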