Spectrogram
Waveform
Computer Science
Speech Recognition
Artificial Intelligence
Telecommunications
Radar
Authors
Ning Yu, Long Chen, Tao Leng, Zigang Chen, Xiaoyin Yi
Identifier
DOI:10.1016/j.jisa.2024.103720
Abstract
Research on deepfake speech techniques is crucial for combating the spread of fake information, safeguarding public privacy, and advancing forensic techniques. However, the lack of transparency and explainability in spoofed speech detection models raises concerns about their reliability. In this paper, we propose using raw waveform signals and spectrograms as fused features for a spoofed speech detection model. We use the SHAP method to analyze the feature distribution of spoofed speech detection and to explain the likelihood that an utterance is fake. Our experimental results demonstrate that this approach achieves better classification results with fewer model parameters than other feature fusion methods. Finally, feature contribution values computed with the SHAP method are visualized as heat maps. This helps researchers analyze the feature distribution of spoofed speech, identify the features most critical for distinguishing spoofed from bona fide speech, and ensure transparency in the models' use.
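A minimal sketch, not the authors' implementation, of the two ideas the abstract describes: extracting the two fused views of an utterance (raw waveform and spectrogram) and rendering per-feature contribution values as a heat map. It assumes librosa and matplotlib; the file name, feature sizes, and the random attribution matrix are placeholders standing in for SHAP values produced from a trained detector.

```python
# Sketch only: fused waveform/spectrogram views + attribution heat map.
import numpy as np
import librosa
import matplotlib.pyplot as plt

def fused_features(wav_path, sr=16000, n_mels=64, max_len=64000):
    """Load an utterance and return (raw waveform, log-mel spectrogram)."""
    y, _ = librosa.load(wav_path, sr=sr, mono=True)
    y = np.pad(y, (0, max(0, max_len - len(y))))[:max_len]    # fix the length
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    log_mel = librosa.power_to_db(mel, ref=np.max)             # spectrogram view
    return y, log_mel

def plot_attribution_heatmap(attributions, title="Feature contributions"):
    """Render an (n_mels x frames) attribution matrix as a heat map."""
    plt.imshow(attributions, origin="lower", aspect="auto", cmap="coolwarm")
    plt.colorbar(label="contribution to the 'spoofed' score")
    plt.xlabel("frame")
    plt.ylabel("mel band")
    plt.title(title)
    plt.show()

if __name__ == "__main__":
    wave, spec = fused_features("utterance.flac")   # hypothetical input file
    # In the paper the contributions come from SHAP over the detector; random
    # values are substituted here only to demonstrate the heat-map rendering.
    fake_attr = np.random.randn(*spec.shape) * 0.1
    plot_attribution_heatmap(fake_attr)
```

In practice, the attribution matrix would come from an explainer such as shap.DeepExplainer applied to the trained detection model, with the two feature views concatenated as the model input.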