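Computer science
Perception
Emotion recognition
Consistency (knowledge bases)
Artificial intelligence
Natural language processing
Space (punctuation)
Affective computing
Speech recognition
Subjectivity
Task (project management)
Machine learning
Human-computer interaction
Psychology
Engineering
Systems engineering
Neuroscience
Philosophy
Operating system
Epistemology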
Authors
Shreya Upadhyay,Woan-Shiuan Chien,Bo-Hao Su,Chi-Chun Lee
Source
Journal: IEEE Transactions on Affective Computing
[Institute of Electrical and Electronics Engineers]
Date: 2024-01-31
Volume (issue): 15 (3), pages 1539-1552
Identifier
DOI:10.1109/taffc.2024.3360428
Abstract
Automatic sensing of emotional information in speech is important for numerous everyday applications. Conventional Speech Emotion Recognition (SER) models are trained on averaged or consensus human annotations, but emotions and raters' interpretations are inherently subjective, so perceptions vary widely. To address this, our proposed approach incorporates rater subjectivity by forming Perception-Coherent Clusters (PCC) of raters, which are used to derive an expanded label space for SER learning. We evaluate our method on the IEMOCAP and MSP-Podcast corpora, considering scenarios of fixed and variable raters, respectively. The proposed architecture, Rater Perception Coherency (RPC)-based SER, surpasses single-task models trained with consensus labels, achieving UAR improvements of 3.39% on IEMOCAP and 2.03% on MSP-Podcast. Further analysis provides comprehensive insights into the contributions of these perception consistency clusters to SER learning.
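The abstract reports its gains in UAR without expanding the acronym. In SER work it conventionally stands for unweighted average recall: the mean of per-class recalls, so each emotion class counts equally regardless of how many utterances it has. The sketch below illustrates that standard definition only. The function name and the toy labels are invented for illustration and are not taken from the paper, whose exact evaluation protocol is not described here.

```python
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of per-class recalls; each class weighs the same regardless of size."""
    total = defaultdict(int)    # number of true samples per class
    correct = defaultdict(int)  # correctly predicted samples per class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy, imbalanced labels (hypothetical, not from IEMOCAP or MSP-Podcast).
y_true = ["neutral"] * 6 + ["angry"] * 2
y_pred = ["neutral"] * 6 + ["neutral", "angry"]

uar = unweighted_average_recall(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"UAR = {uar:.3f}, accuracy = {accuracy:.3f}")
```

On this toy data the majority class is always recognized but half of the minority class is missed, so plain accuracy looks better than UAR. That gap is the usual reason for preferring UAR on class-imbalanced emotion corpora.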