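Spectrogram
Waveform
Speech recognition
Pattern recognition (psychology)
Artificial intelligence
Computer science
Representation (politics)
Sound (geography)
Acoustics
Physics
Telecommunications
Political science
Politics
Law
Radar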
Authors
Haihui Chen,Lusen Ran,Xixia Sun,Chao Cai
Identifier
DOI:10.1109/icassp49357.2023.10096742
Abstract
Anomalous Sound Detection (ASD) aims to identify whether the sound emitted by a machine is anomalous. Most advanced methods use 2-D CNNs to extract features of normal sounds from log-mel spectrograms. However, these methods cannot fully exploit the temporal information in log-mel spectrograms, which results in poor performance on some machine types. In this paper, we propose a new ASD framework named Spectrogram-Wavegram WaveNet (SW-WaveNet), which segments the 2-D log-mel spectrogram into 1-D waveform signals of different frequency bands and combines the representation vectors that WaveNet extracts from the segmented log-mel spectrograms and from Wavegrams. The proposed framework uses WaveNet's strong capability for modeling waveform signals to extract temporal information effectively from both log-mel spectrograms and Wavegrams. Experiments on the DCASE 2020 Challenge Task 2 dataset show that our framework achieves a higher average AUC (93.25%) and a higher average pAUC (87.41%) than previous works.
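The segmentation step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact procedure, so the following assumes that the spectrogram is stored as a list of mel-band rows (shape `n_mels` by `n_frames`) and that each row is treated as one 1-D signal over time. The function name `segment_into_bands` and the parameter `bands_per_segment` are hypothetical and are not taken from the paper.

```python
from typing import List

# Assumed layout: one inner list per mel band, one value per time frame.
Spectrogram = List[List[float]]


def segment_into_bands(log_mel: Spectrogram,
                       bands_per_segment: int = 1) -> List[Spectrogram]:
    """Split a (n_mels, n_frames) spectrogram along the frequency axis.

    Each returned segment holds `bands_per_segment` consecutive mel rows.
    Every row inside a segment is a 1-D sequence over time, which is the
    kind of input a 1-D temporal model could consume.
    This is an illustrative assumption, not the authors' exact method.
    """
    if bands_per_segment < 1:
        raise ValueError("bands_per_segment must be at least 1")
    n_mels = len(log_mel)
    if n_mels % bands_per_segment != 0:
        raise ValueError("n_mels must be divisible by bands_per_segment")
    return [log_mel[i:i + bands_per_segment]
            for i in range(0, n_mels, bands_per_segment)]


# Toy spectrogram: 4 mel bands, 3 time frames (values are made up).
toy = [
    [0.0, 1.0, 2.0],
    [10.0, 11.0, 12.0],
    [20.0, 21.0, 22.0],
    [30.0, 31.0, 32.0],
]

per_band = segment_into_bands(toy)        # 4 segments, 1 band each
paired = segment_into_bands(toy, 2)       # 2 segments, 2 bands each
```

In this sketch the time axis is left untouched, so each band keeps its full temporal ordering; only the frequency axis is partitioned.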