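Computer science
Conversation
Utterance
Task (project management)
Benchmark (surveying)
Context (archaeology)
Speech recognition
Focus (optics)
Granularity
Natural language processing
Speaker recognition
Artificial intelligence
Exploit
Psychology
Communication
Paleontology
Biology
Operating system
Physics
Computer security
Management
Geodesy
Optics
Economics
Geography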
Identifier
DOI:10.1109/icassp48485.2024.10446410
Abstract
Emotion recognition in conversation (ERC) has received extensive attention in recent years because of its wide range of applications. We focus on real-time conversation scenarios, where two questions remain understudied: how to model conversational emotion using only historical contextual information, and how to exploit speaker information for emotion recognition. We therefore propose MCM-CSD, a novel multi-task learning model that combines ERC based on multi-granularity context modeling (MCM) with contrastive speaker detection (CSD). For the main task, ERC, we design a bottom-up approach that extracts multi-granularity contextual information at both the word and utterance levels. For the auxiliary task, CSD, we design a supervised contrastive learning loss that can distinguish previous speakers from the current speaker in multi-turn conversations. Experiments on four benchmark datasets show that our model achieves state-of-the-art performance compared with previous methods. Ablation experiments and a case study further verify the effectiveness of each component and explain the significance of CSD in MCM-CSD. The code is available at https://github.com/WHOISJENNY/MCM-CSD.
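The abstract does not give the form of the CSD loss, so the following is only a minimal sketch of a generic supervised contrastive loss of the kind the auxiliary task is described as using. It is not the authors' implementation. The function name `supervised_contrastive_loss`, the `temperature` parameter, and the choice of speaker identities as labels are all assumptions made for illustration.

```python
import math


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (illustrative, not the paper's CSD loss).

    embeddings: list of equal-length float vectors, one per utterance.
    labels: list of speaker labels, one per utterance (assumed labeling scheme).
    Returns the mean loss over anchors that have at least one positive.
    """
    # L2-normalize so the dot product below is a cosine similarity.
    normed = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        normed.append([x / norm for x in vec])

    n = len(normed)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives: other utterances that share the anchor's speaker label.
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        # Temperature-scaled similarity of the anchor to every other utterance.
        sims = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp for the softmax denominator.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(
            sum(math.exp(s - max_sim) for s in sims.values())
        )
        # Average negative log-probability assigned to the positives.
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy utterance embeddings: two near [1, 0] and two near [0, 1].
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_aligned = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
loss_mixed = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

In this toy setup the loss is lower when utterances with the same speaker label lie close together in embedding space (`loss_aligned`) than when the labels cut across the clusters (`loss_mixed`), which is the property a speaker-detection auxiliary objective would rely on.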