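Computer science
Dual (grammatical number)
Preprocessor
Speech recognition
Artificial intelligence
Pattern recognition (psychology)
Literature
Art
Authors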
Qiupu Chen,Guimin Huang
Identifier
DOI:10.1016/j.engappai.2021.104277
Abstract
Although the emotional state does not alter the content of language, it is a major determinant in human communication, because it provides much more positive feedback. The purpose of speech emotion recognition (SER) is to automatically identify the emotional or physiological state of a human being from their voice. In this paper, we propose a novel dual-level architecture, called dual attention-based bidirectional long short-term memory networks (dual attention-BLSTM), to recognize speech emotion. We also confirm that, in the dual-layer structure, recognition performance is better with different features as input than with identical features. Experiments on the IEMOCAP database show the advantage of our proposed approach. The average recognition accuracy of our method is 70.29% in unweighted accuracy (UA), an improvement of 2.89 over the best baseline methods. The results show that the architecture we designed can better learn distinguishing features of the emotional information.

• We propose a novel dual-level architecture, called dual attention-BLSTM, for SER.
• We propose a new data preprocessing mechanism based on linear interpolation and decimation.
• Performance is better when different features are input to the dual-layer structure.
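The abstract reports its headline result as unweighted accuracy (UA). In the SER literature, UA is conventionally the mean of the per-class recalls, so every emotion class counts equally regardless of how many utterances it has. The following is a minimal sketch of that conventional definition, not the authors' evaluation code; the function names and the toy labels are assumptions made for illustration.

```python
from collections import defaultdict


def unweighted_accuracy(y_true, y_pred):
    """Mean of per-class recalls (conventional UA definition; assumed here)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    # Recall for each class that appears in the ground truth.
    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


def weighted_accuracy(y_true, y_pred):
    """Overall fraction of correct predictions, shown for comparison."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy labels: four "ang" utterances and two "sad" utterances.
y_true = ["ang", "ang", "ang", "ang", "sad", "sad"]
y_pred = ["ang", "ang", "ang", "ang", "sad", "hap"]

ua = unweighted_accuracy(y_true, y_pred)
wa = weighted_accuracy(y_true, y_pred)
print(f"UA = {ua:.4f}, WA = {wa:.4f}")
```

On this toy input the "ang" recall is 1.0 and the "sad" recall is 0.5, so UA is 0.75, while the overall accuracy is 5/6. The gap shows why UA is often preferred on class-imbalanced emotion corpora.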