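Computer science
Artificial intelligence
Feature learning
Supervised learning
Event (particle physics)
Pattern recognition (psychology)
Linear discriminant analysis
Feature (linguistics)
Semi-supervised learning
Machine learning
Speech recognition
Encoder
Artificial neural network
Quantum mechanics
Operating system
Physics
Philosophy
Linguistics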
Authors
Juan Wei, Qian Zhang, Wenjun Ning
Identifier
DOI:10.1016/j.dsp.2023.104199
Abstract
Most abnormal acoustic event detection (AAED) relies on supervised training of deep learning models, but manually labeled samples are costly and scarce. To address this problem, this work proposes a self-supervised representation learning approach for AAED based on contrastive learning. Auditory and visual data augmentations are applied simultaneously to create positive sample pairs, and an attention mechanism is introduced into the encoder during self-supervised pre-training. Features fused by discriminant correlation analysis are compared against a single feature to verify how well the self-supervised pre-trained model captures features. Pre-training is carried out on a noisy abnormal acoustic dataset. The results show that the self-supervised pre-trained model achieves an accuracy of 87.72% in linear evaluation and 88.70% in the downstream task on a small, clean AAED dataset, exceeding the results of supervised learning. This work eases the demand for labeled abnormal acoustic events.
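
The abstract does not state which contrastive objective is used. As a rough illustration of how positive sample pairs drive contrastive pre-training, the sketch below implements the NT-Xent loss popularized by SimCLR, which is an assumption here and not necessarily the authors' choice. The function names, the temperature value, and the toy embeddings are all hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """NT-Xent loss over a batch of positive pairs (assumed objective).

    view_a[i] and view_b[i] are embeddings of two augmentations of the
    same audio clip; every other embedding in the batch is a negative.
    """
    n = len(view_a)
    z = [l2_normalize(v) for v in view_a + view_b]  # 2n embeddings
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the positive partner of sample i
        sims = [
            sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(2 * n)
        ]
        # Log-sum-exp over all samples except i itself, shifted for stability.
        others = [s for j, s in enumerate(sims) if j != i]
        m = max(others)
        log_denom = m + math.log(sum(math.exp(s - m) for s in others))
        total += log_denom - sims[pos]
    return total / (2 * n)


# Toy embeddings: each row of view_b is a slightly perturbed copy of view_a.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9]]
loss = nt_xent_loss(view_a, view_b)
```

The loss is small when the two augmented views of a clip map to nearby embeddings and grows when a clip's embedding lies closer to other clips than to its own augmented view, which is what pushes the encoder toward augmentation-invariant features without labels.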