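Cluster analysis
Weighting
Computer science
Artificial intelligence
Sample (material)
Feature (linguistics)
Emotion recognition
Baseline (sea)
Pattern recognition (psychology)
Data mining
Probability distribution
Machine learning
Speech recognition
Statistics
Mathematics
Oceanography
Radiology
Geology
Philosophy
Medicine
Chromatography
Chemistry
Linguistics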
Authors
Zhi-Kun Peng,Zhentao Liu,Meng-Ting Han
Identifier
DOI:10.1109/cac57257.2022.10054824
Abstract
Speech emotion recognition (SER) is a key technology for achieving natural human-computer interaction. Progress in SER is strongly influenced by the amount of sample data available. In recent years, SER research has intensified with the introduction of data augmentation methods. However, most of these methods simply enlarge the sample set and neglect to analyze and exploit the feature distribution of the samples. In this paper, we propose a new clustering-assisted framework for SER that uses the feature distribution information of the sample data directly and effectively. The framework takes the proportion of samples from each emotion category within each cluster obtained by clustering and converts it into a probability score, called the clustering emotion probability score. This score is then fused with the emotion probability score from a simple classification model according to different fusion weighting factors. We evaluated the proposed method against a baseline model on the IEMOCAP dataset. Experimental results show that our method outperforms the baseline model in both weighted and unweighted accuracy.
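The fusion step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the clustering algorithm, the classifier, or the exact fusion formula, so the code below assumes that cluster assignments are already available (for example from k-means), that the fusion is a simple convex combination controlled by a single hypothetical `weight` parameter, and that empty clusters fall back to a uniform score. All function names are illustrative.

```python
from collections import Counter


def cluster_emotion_scores(cluster_ids, labels, n_clusters, n_classes):
    """Return, for each cluster, the share of training samples per emotion class."""
    counts = [Counter() for _ in range(n_clusters)]
    for c, y in zip(cluster_ids, labels):
        counts[c][y] += 1
    scores = []
    for counter in counts:
        total = sum(counter.values())
        if total == 0:
            # Assumption: an empty cluster carries no information, so use a uniform score.
            scores.append([1.0 / n_classes] * n_classes)
        else:
            scores.append([counter[k] / total for k in range(n_classes)])
    return scores


def fuse(p_model, p_cluster, weight):
    """Assumed fusion rule: weight * cluster score + (1 - weight) * classifier score."""
    return [weight * pc + (1.0 - weight) * pm for pm, pc in zip(p_model, p_cluster)]


def predict(p_model, cluster_id, scores, weight):
    """Pick the emotion class with the highest fused probability score."""
    fused = fuse(p_model, scores[cluster_id], weight)
    return max(range(len(fused)), key=fused.__getitem__)


# Toy data: 8 training samples, 2 clusters, 3 emotion classes (all values made up).
train_clusters = [0, 0, 0, 0, 1, 1, 1, 1]
train_labels = [0, 0, 0, 1, 1, 1, 2, 2]
scores = cluster_emotion_scores(train_clusters, train_labels, n_clusters=2, n_classes=3)

# Hypothetical classifier output for a test sample that falls into cluster 0.
p_model = [0.40, 0.45, 0.15]
print(predict(p_model, 0, scores, weight=0.0))  # classifier alone
print(predict(p_model, 0, scores, weight=0.5))  # classifier fused with cluster score
```

In this toy case the classifier alone slightly prefers class 1, while the cluster the sample falls into is dominated by class 0, so a nonzero fusion weight shifts the decision. Setting `weight` to 0 recovers the classifier's own prediction, which makes the parameter easy to tune against a baseline.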