Keywords
Probabilistic logic; Computer science; Residual; Weighting; Artificial intelligence; Machine learning; Pairwise comparison; Distillation; Convolutional neural network; Deep learning; Frame (networking); Algorithm; Medicine; Telecommunications; Organic chemistry; Radiology; Chemistry
Authors
Jiaming Cheng,Ruiyu Liang,Lin Zhou,Li Zhao,Chengwei Huang,Björn W. Schuller
Source
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Date: 2024-01-01
Volume 32, pp. 2680-2691
Identifier
DOI: 10.1109/TASLP.2024.3395978
Abstract
In recent years, a great deal of research has focused on developing neural network (NN)-based speech enhancement (SE) models, which have achieved promising results. However, NN-based models typically require expensive computation to achieve remarkable performance, constraining their deployment in real-world scenarios, especially when hardware resources are limited or latency requirements are strict. To reduce this computational burden, we propose a unified residual fusion probabilistic knowledge distillation (KD) method for the SE task, in which knowledge is transferred from a deep teacher to a shallower student model. Previous KD approaches commonly focused on narrowing the output distances between teachers and students, but research on the intermediate representations of these models is lacking. In this paper, we first study a cross-layer residual feature fusion strategy, which enables the student model to distill knowledge contained in multiple teacher layers, from shallow to deep. Second, a frame-weighted probabilistic distillation loss is proposed to place more emphasis on frames containing essential information and to preserve pairwise probabilistic similarities in the representation space. The proposed distillation framework is applied to the dual-path dilated convolutional recurrent network (DPDCRN), which won the SE track of the L3DAS23 challenge. Extensive experiments are conducted on single-channel and multichannel SE datasets. Objective evaluations show that the proposed KD strategy outperforms other distillation methods and considerably improves the enhancement performance of the low-complexity student model (which has only 17% of the teacher's parameters).
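To make the cross-layer residual fusion idea concrete, the following is a minimal PyTorch sketch of one plausible reading of the abstract: several teacher layer outputs, ordered shallow to deep, are projected to the student's feature width and accumulated residually into a single distillation target. The class name, the linear projections, and the plain residual sum are illustrative assumptions; the paper's actual fusion module is not specified in the abstract.

```python
import torch
import torch.nn as nn

class CrossLayerResidualFusion(nn.Module):
    """Sketch: fuse multiple teacher layers (shallow -> deep) into one
    target for a student layer. Hypothetical design, not the paper's
    exact module."""

    def __init__(self, teacher_dims, student_dim):
        super().__init__()
        # One projection per teacher layer to align it with the
        # (typically narrower) student feature width.
        self.proj = nn.ModuleList(
            [nn.Linear(d, student_dim) for d in teacher_dims]
        )

    def forward(self, teacher_feats):
        # teacher_feats: list of (batch, frames, dim_i), shallow -> deep.
        fused = self.proj[0](teacher_feats[0])
        for proj, feat in zip(self.proj[1:], teacher_feats[1:]):
            fused = fused + proj(feat)  # residual accumulation across depth
        return fused

# Usage sketch: a student layer of width 64 distilling three teacher layers.
fusion = CrossLayerResidualFusion([128, 256, 512], 64)
feats = [torch.randn(2, 100, d) for d in (128, 256, 512)]
target = fusion(feats)  # (2, 100, 64), compared against the student layer output
```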
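Similarly, here is a hedged sketch of a frame-weighted probabilistic distillation loss in the spirit the abstract describes: pairwise frame-to-frame similarities are turned into probability distributions, the student matches the teacher's distributions via KL divergence, and per-frame weights emphasize important frames. The choice of cosine similarity, softmax temperature of 1, and energy-derived weights are all assumptions; the abstract does not define the exact formulation.

```python
import torch
import torch.nn.functional as F

def frame_weighted_probabilistic_kd(student_feat, teacher_feat,
                                    frame_weight, eps=1e-8):
    """Sketch of a frame-weighted probabilistic KD loss.
    student_feat, teacher_feat: (batch, frames, dim) representations.
    frame_weight: (batch, frames) non-negative importance per frame,
    e.g. derived from frame energy (an assumption; the paper's
    weighting scheme is not given in the abstract)."""
    # Pairwise cosine similarities between frames within each utterance.
    s = F.normalize(student_feat, dim=-1)
    t = F.normalize(teacher_feat, dim=-1)
    sim_s = torch.bmm(s, s.transpose(1, 2))  # (batch, frames, frames)
    sim_t = torch.bmm(t, t.transpose(1, 2))
    # Each frame's similarity row becomes a probability distribution,
    # so the student matches the teacher's pairwise similarity structure.
    log_p_s = F.log_softmax(sim_s, dim=-1)
    p_t = F.softmax(sim_t, dim=-1)
    # Per-frame KL divergence between teacher and student distributions.
    kl = F.kl_div(log_p_s, p_t, reduction="none").sum(dim=-1)  # (batch, frames)
    # Normalize weights per utterance; emphasized frames contribute more.
    w = frame_weight / (frame_weight.sum(dim=-1, keepdim=True) + eps)
    return (w * kl).sum(dim=-1).mean()

# Usage sketch with random tensors and hypothetical per-frame energies.
student = torch.randn(2, 100, 64)
teacher = torch.randn(2, 100, 64)
energy = torch.rand(2, 100)
loss = frame_weighted_probabilistic_kd(student, teacher, energy)
```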