Discernment
Misinformation
Audiovisual
Computer science
Politics
Modality
Mask (illustration)
Speech recognition
Multimedia
Sociology
Art
Epistemology
Computer security
Visual arts
Political science
Philosophy
Law
Social science
Authors
Matthew Groh, Aruna Sankaranarayanan, Nikhil Singh, D Kim, Andrew Lippman, Rosalind W. Picard
Identifier
DOI:10.1038/s41467-024-51998-z
Abstract
Recent advances in technology for hyper-realistic visual and audio effects provoke the concern that deepfake videos of political speeches will soon be indistinguishable from authentic video. We conduct 5 pre-registered randomized experiments with N = 2215 participants to evaluate how accurately humans distinguish real political speeches from fabrications across base rates of misinformation, audio sources, question framings with and without priming, and media modalities. We do not find that base rates of misinformation have statistically significant effects on discernment. We find that deepfakes with audio produced by state-of-the-art text-to-speech algorithms are harder to discern than the same deepfakes with voice actor audio. Moreover, across all experiments and question framings, we find that audio and visual information enables more accurate discernment than text alone: human discernment relies more on how something is said (the audio-visual cues) than on what is said (the speech content).