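Convolution (computer science)
Computer science
Convolutional neural network
Representation (politics)
Event (particle physics)
Basis (linear algebra)
Pattern recognition (psychology)
Artificial intelligence
Scale (ratio)
Artificial neural network
Mathematics
Quantum mechanics
Political science
Politics
Physics
Law
Geometry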
Authors
Jun Wang, Peng Yao, Feng Deng, Jianchao Tan, Chengru Song, Xiaorui Wang
Identifier
DOI:10.1109/icassp49357.2023.10096621
Abstract
CNN+RNN models have become the mainstream approach for semi-supervised sound event detection, where the CNN part is typically a stack of 2D convolutional layers that capture representations of the time-frequency features. However, conventional 2D convolution has limited ability to capture detailed information about acoustic events. In this paper, to enhance the representation ability of the CNN, we propose NAS-DYMC, a NAS-based dynamic multi-scale convolutional neural network that extracts a more effective acoustic representation. Specifically, multi-scale convolution captures the characteristics of sound events with different time-frequency distributions, and dynamic convolution enhances the representation capability of conventional convolution by applying adaptive attention weights to basis kernels. Furthermore, a neural architecture search (NAS) method is adopted to find the optimal network architecture for the DCASE 2021 Task 4 dataset from a search space consisting of various dynamic multi-scale convolutions. Experimental results demonstrate the superiority of our proposed method.
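The two building blocks named in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the attention module, the kernel sizes, or how the branches are merged. The sketch assumes attention from global average pooling followed by one linear layer and a softmax, kernel sizes of 3 and 5, and channel-wise concatenation of the branches. All function and parameter names (`dynamic_conv2d`, `multi_scale_dynamic_conv`, `attn_w`, `attn_b`) are hypothetical, and the NAS step is not shown.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def conv2d_same(x, w):
    """Naive 'same'-padded 2D cross-correlation.
    x: (C_in, T, F) time-frequency input; w: (C_out, C_in, kh, kw), odd kh and kw."""
    c_out, c_in, kh, kw = w.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((0, 0), (ph, ph), (pw, pw)))
    _, T, F = x.shape
    out = np.zeros((c_out, T, F))
    for i in range(kh):
        for j in range(kw):
            patch = xp[:, i:i + T, j:j + F]                  # shifted view, (C_in, T, F)
            out += np.einsum("oc,ctf->otf", w[:, :, i, j], patch)
    return out

def dynamic_conv2d(x, basis, attn_w, attn_b):
    """Dynamic convolution: mix K basis kernels with input-dependent weights.
    basis: (K, C_out, C_in, kh, kw); attn_w: (K, C_in); attn_b: (K,).
    The attention form here is an assumption, not taken from the paper."""
    pooled = x.mean(axis=(1, 2))                 # global average pooling, (C_in,)
    pi = softmax(attn_w @ pooled + attn_b)       # attention over basis kernels, (K,)
    kernel = np.tensordot(pi, basis, axes=1)     # aggregated kernel, (C_out, C_in, kh, kw)
    return conv2d_same(x, kernel), pi

def multi_scale_dynamic_conv(x, branches):
    """Run one dynamic convolution per kernel size and concatenate along channels
    (concatenation is an assumed way of merging the scales)."""
    outs = [dynamic_conv2d(x, *branch)[0] for branch in branches]
    return np.concatenate(outs, axis=0)

# Toy usage: 1 input channel, 16 time frames, 8 frequency bins.
rng = np.random.default_rng(0)
x = rng.standard_normal((1, 16, 8))
branches = []
for k in (3, 5):                                 # two scales (assumed sizes)
    basis = rng.standard_normal((4, 2, 1, k, k)) # K=4 basis kernels, 2 output channels
    branches.append((basis, rng.standard_normal((4, 1)), np.zeros(4)))
y = multi_scale_dynamic_conv(x, branches)        # shape (4, 16, 8)
```

Because convolution is linear in the kernel, mixing the basis kernels first and convolving once gives the same result as convolving with each basis kernel and mixing the outputs, at roughly the cost of a single convolution.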