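Computer science
Convolutional neural network
Audio signal
Residual
Speech recognition
Artificial intelligence
Constraint (computer-aided design)
Audio analyzer
Pattern recognition (psychology)
Representation (politics)
Digital audio
Signal (programming language)
Feature extraction
Speech coding
Algorithm
Mathematics
Geometry
Politics
Political science
Law
Programming language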
Authors
Safaa Allamy, Alessandro L. Koerich
Identifier
DOI:10.1109/ssci50451.2021.9659979
Abstract
This paper proposes a 1D residual convolutional neural network (CNN) architecture for music genre classification and compares it with other recent 1D CNN architectures. The 1D CNNs learn both a representation and a discriminant directly from the raw audio signal. Several convolutional layers capture the time-frequency characteristics of the audio signal and learn various filters relevant to the music genre recognition task. The proposed approach splits the audio signal into overlapping segments using a sliding window to comply with the fixed-length input constraint of the 1D CNNs. As a result, music genre classification can be carried out on a single audio segment or by aggregating the predictions over several audio segments, which improves the final accuracy. The performance of the proposed 1D residual CNN is assessed on a public dataset of 1,000 audio clips. The experimental results show that it achieves a mean accuracy of 80.93% in classifying music genres and outperforms the other 1D CNN architectures.
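The segmentation and aggregation steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the zero-padding of short clips, and the choice of averaging per-segment probabilities are assumptions made for illustration, since the abstract does not state the window length, the overlap, or the aggregation rule.

```python
import numpy as np


def split_into_segments(signal, segment_len, hop_len):
    """Split a 1D signal into overlapping fixed-length segments.

    segment_len and hop_len are hypothetical parameters; a hop smaller
    than segment_len produces overlapping windows.
    """
    signal = np.asarray(signal)
    if len(signal) < segment_len:
        # Assumption: zero-pad short clips so one full segment exists.
        signal = np.pad(signal, (0, segment_len - len(signal)))
    starts = range(0, len(signal) - segment_len + 1, hop_len)
    return np.stack([signal[s:s + segment_len] for s in starts])


def aggregate_predictions(segment_probs):
    """Average per-segment class probabilities (an assumed rule)
    and return the winning class index with the averaged scores."""
    mean_probs = np.mean(np.asarray(segment_probs), axis=0)
    return int(np.argmax(mean_probs)), mean_probs


# Toy example: a 10-sample "clip", windows of 4 samples, hop of 2.
segments = split_into_segments(np.arange(10), segment_len=4, hop_len=2)

# Stand-in for the CNN outputs: one probability row per segment.
fake_probs = [[0.6, 0.4], [0.2, 0.8], [0.3, 0.7], [0.4, 0.6]]
label, scores = aggregate_predictions(fake_probs)
```

In this toy run the clip yields four overlapping segments, and averaging the stand-in probabilities selects class index 1 even though the first segment alone would have voted for class 0, which is the kind of correction that aggregation over segments is meant to provide.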