Keywords
Pattern, Computer science, Bottleneck, Scalability, Conversation, Task (project management), Modality (human-computer interaction), Artificial intelligence, Human-computer interaction, Machine learning, Communication, Database, Sociology, Embedded system, Economy, Management, Social science
Authors
Feiyu Chen,Zhengxiao Sun,Deqiang Ouyang,Xueliang Liu,Jie Shao
Identifier
DOI:10.1145/3474085.3475661
Abstract
Multi-sensory data has exhibited a clear advantage in expressing richer and more complex feelings on the Emotion Recognition in Conversation (ERC) task. Yet, current methods for multimodal dynamics, which aggregate modalities or employ additional modality-specific and modality-shared networks, are still inadequate at balancing the sufficiency of multimodal processing against scalability to incremental additions of multi-sensory data types. This creates a bottleneck for performance improvement in ERC. To this end, we present MetaDrop, a differentiable, end-to-end approach for the ERC task that learns module-wise decisions across modalities and conversation flows simultaneously, supporting adaptive information-sharing patterns and dynamic fusion paths. Our framework mitigates the problem of modelling complex multimodal relations while enjoying good scalability in the number of modalities. Experiments on two popular multimodal ERC datasets show that MetaDrop achieves new state-of-the-art results.
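The abstract describes learnable, module-wise keep/drop decisions over modality branches, made differentiable end-to-end. The paper's exact mechanism is not given here, so the following is only a minimal sketch of one common way to realize such gating: a Gumbel-softmax relaxation of a per-modality binary gate, with the gated branch outputs fused afterwards. All names, feature sizes, and the averaging fusion are illustrative assumptions, not MetaDrop's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

def gumbel_softmax(logits, tau=1.0):
    """Differentiable relaxation of a discrete choice.
    Returns soft probabilities over the options (here: keep vs. drop)."""
    gumbel_noise = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel_noise) / tau
    e = np.exp(y - y.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def gated_fuse(modality_feats, gate_logits, tau=1.0):
    """Gate each modality branch, then average the (softly) kept branches.
    modality_feats: dict name -> (d,) branch output
    gate_logits:    dict name -> (2,) learnable logits for [keep, drop]
    """
    gated, weights = [], []
    for name, feat in modality_feats.items():
        p_keep = gumbel_softmax(gate_logits[name], tau)[0]  # soft 'keep' prob
        gated.append(p_keep * feat)
        weights.append(p_keep)
    return np.sum(gated, axis=0) / (np.sum(weights) + 1e-8)

# Hypothetical setup: three modality branches with 8-dim outputs.
feats = {m: rng.normal(size=8) for m in ("text", "audio", "video")}
logits = {m: rng.normal(size=2) for m in ("text", "audio", "video")}
fused = gated_fuse(feats, logits)
print(fused.shape)
```

Because the gate stays soft during training, gradients flow through the keep/drop decision; at low temperature `tau` the gate approaches a hard binary choice, and adding a new modality only adds one more branch and one more gate, which is the scalability property the abstract emphasizes.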