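Computer science
Discriminative
Embedding
Generalization
Class (philosophy)
Coding (set theory)
Artificial intelligence
Scheme (mathematics)
Speech recognition
Machine learning
Base (topology)
Pattern recognition (psychology)
Set (abstract data type)
Mathematics
Mathematical analysis
Programming language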
Authors
Wei Xie, Yanxiong Li, Qianhua He, Wenchang Cao
Identifier
DOI:10.1016/j.eswa.2023.120044
Abstract
In real-world scenarios, new audio classes with insufficient samples often emerge continually, which motivates the study of few-shot class-incremental audio classification (FCAC) in this paper. FCAC aims to enable the model to continually recognize new audio classes while remembering the base ones. In solving the FCAC problem, the discriminability of the prototypes is vital to the model’s classification performance. Thus, we propose a method that learns discriminative prototypes from two aspects. First, since the generalization ability of the embedding module (EM) significantly affects the discriminability of the prototypes, the proposed method employs a pseudo-episodic incremental training scheme that trains the EM by simulating the test scenario. Second, to enable the model to achieve balanced classification performance on both base and new audio classes, the proposed method employs a selective-attention module that adjusts the prototypes to enhance their discriminability. Extensive experimental results demonstrate that the proposed method achieves state-of-the-art performance on the FCAC problem. Notably, the proposed method achieves a comprehensive performance score (CPS) of 87.82% and 59.25% on the Neural Synthesis musical notes of 100 classes (NSynth-100) and Free sound clips of 89 classes (FSC-89) datasets, respectively, outperforming the comparison methods. Our code is available at https://github.com/chester-w-xie/DPL_FCAC.
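To make the role of prototypes concrete, the following is a minimal sketch of prototype-based classification across a base session and one few-shot incremental session. It is not the authors' implementation: it assumes each prototype is the mean embedding of a class and that a query is assigned to the prototype with the highest cosine similarity, neither of which the abstract states. The embeddings are toy 2-D vectors standing in for the output of an embedding module, and all function and class names (`class_prototypes`, `classify`, `base_a`, `new_c`) are hypothetical. The pseudo-episodic training scheme and the selective-attention module are not modeled here.

```python
import math
from collections import defaultdict


def class_prototypes(embeddings, labels):
    """Assumption: a prototype is the per-class mean of the embeddings."""
    grouped = defaultdict(list)
    for vec, label in zip(embeddings, labels):
        grouped[label].append(vec)
    return {
        label: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for label, vecs in grouped.items()
    }


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def classify(query, prototypes):
    """Assumption: predict the class whose prototype is most similar."""
    return max(prototypes, key=lambda label: cosine(query, prototypes[label]))


# Base session: two classes with toy 2-D embeddings.
prototypes = class_prototypes(
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]],
    ["base_a", "base_a", "base_b", "base_b"],
)

# Incremental session: a new class arrives with only two samples (few-shot).
# Existing prototypes are kept, so the base classes are not forgotten.
prototypes.update(class_prototypes([[-1.0, 0.0], [-0.9, -0.1]], ["new_c"]))

pred_new = classify([-0.8, 0.1], prototypes)
pred_base = classify([0.95, 0.05], prototypes)
print(pred_new, pred_base)
```

In this setup, the better the embedding separates classes, the further apart the prototypes lie, which is one way to read the abstract's emphasis on prototype discriminability.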