Disgust
Anger
Emotion perception
Psychology
Happiness
Perception
Set (abstract data type)
Cognitive psychology
Facial expression
Audiovisual
Emotion classification
Emotional expression
Modality (human–computer interaction)
Pattern
Computer science
Communication
Social psychology
Artificial intelligence
Multimedia
Sociology
Neuroscience
Programming language
Social science
Authors
Houwei Cao, David G. Cooper, Michael K. Keutmann, Ruben C. Gur, Ani Nenkova, Ragini Verma
Source
Journal: IEEE Transactions on Affective Computing
[Institute of Electrical and Electronics Engineers]
Date: 2014-09-25
Volume/Issue: 5 (4): 377-390
Citations: 476
Identifier
DOI:10.1109/taffc.2014.2336244
Abstract
People convey their emotional state in their face and voice. We present an audio-visual data set uniquely suited for the study of multi-modal emotion expression and perception. The data set consists of facial and vocal emotional expressions in sentences spoken in a range of basic emotional states (happy, sad, anger, fear, disgust, and neutral). 7,442 clips of 91 actors with diverse ethnic backgrounds were rated by multiple raters in three modalities: audio, visual, and audio-visual. Categorical emotion labels and real-valued intensity ratings for the perceived emotion were collected via crowd-sourcing from 2,443 raters. Human recognition of the intended emotion for the audio-only, visual-only, and audio-visual data is 40.9%, 58.2%, and 63.6%, respectively. Recognition rates are highest for neutral, followed by happy, anger, disgust, fear, and sad. Average intensity levels of emotion are rated highest for visual-only perception. Accurate recognition of disgust and fear requires simultaneous audio-visual cues, while anger and happiness can be well recognized from a single modality. The large dataset we introduce can be used to probe other questions concerning the audio-visual perception of emotion.
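The per-modality recognition rates quoted in the abstract are the fraction of crowd-sourced ratings whose perceived emotion matches the actor's intended emotion, computed separately for the audio-only, visual-only, and audio-visual conditions. A minimal sketch of that computation is below; the `(modality, intended, perceived)` record layout is an assumption for illustration, not the actual released data format.

```python
# Sketch: per-modality recognition rate = share of ratings where the
# perceived emotion equals the intended emotion.
# NOTE: the rating-tuple layout here is hypothetical, for illustration only.
from collections import defaultdict

def recognition_rates(ratings):
    """ratings: iterable of (modality, intended_emotion, perceived_emotion)."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for modality, intended, perceived in ratings:
        total[modality] += 1
        if perceived == intended:
            correct[modality] += 1
    return {m: correct[m] / total[m] for m in total}

# Toy example with made-up ratings:
toy = [
    ("audio", "happy", "happy"),
    ("audio", "sad", "neutral"),
    ("visual", "anger", "anger"),
    ("audio-visual", "fear", "fear"),
]
print(recognition_rates(toy))  # {'audio': 0.5, 'visual': 1.0, 'audio-visual': 1.0}
```

Aggregated over all 7,442 clips and 2,443 raters, this is the style of tally that yields the 40.9% / 58.2% / 63.6% figures reported above.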