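Computer science
Speech enhancement
Speech recognition
Track (disk drive)
Multimedia
Convolution (computer science)
Noise (video)
Artificial intelligence
Artificial neural network
Noise reduction
Operating system
Image (mathematics)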
Authors
Yukai Ju,Jun Chen,Shimin Zhang,Shulin He,Wei Rao,Weixin Zhu,Yannan Wang,Tao Yu,Shidong Shang
Identifier
DOI:10.1109/icassp49357.2023.10096838
Abstract
This paper introduces the Unbeatable Team’s submission to the ICASSP 2023 Deep Noise Suppression (DNS) Challenge. We expand our previous work, TEA-PSE, to its upgraded version, TEA-PSE 3.0. Specifically, TEA-PSE 3.0 incorporates a residual LSTM after the squeezed temporal convolution network (S-TCN) to enhance sequence modeling capabilities. Additionally, the local-global representation (LGR) structure is introduced to boost speaker information extraction, and a multi-STFT resolution loss is used to effectively capture the time-frequency characteristics of the speech signals. Moreover, retraining methods are employed based on the freeze training strategy to fine-tune the system. According to the official results, TEA-PSE 3.0 ranks 1st in both track 1 and track 2 of the ICASSP 2023 DNS Challenge.
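
The abstract names a multi-STFT resolution loss but does not give its formula. As a rough illustration of the general idea only, the sketch below compares the magnitude spectrograms of an estimate and a target at several STFT resolutions and averages the mean absolute differences. The function names, the Hann window, the naive DFT, the L1 magnitude distance, and the (frame length, hop) pairs are all assumptions made for this example; they are not taken from TEA-PSE 3.0.

```python
import cmath
import math


def stft_magnitude(signal, frame_len, hop):
    """Magnitude spectrogram from a naive DFT with a Hann window.

    Illustrative and slow; a real system would use an FFT library.
    """
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectrum = []
        for k in range(n_bins):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames


def multi_resolution_stft_loss(estimate, target,
                               resolutions=((16, 4), (32, 8), (64, 16))):
    """Average L1 magnitude distance over several (frame_len, hop) pairs.

    The resolutions are hypothetical values chosen to keep the demo fast.
    """
    if len(estimate) != len(target):
        raise ValueError("estimate and target must have the same length")
    total = 0.0
    for frame_len, hop in resolutions:
        est_mag = stft_magnitude(estimate, frame_len, hop)
        tgt_mag = stft_magnitude(target, frame_len, hop)
        diffs = [abs(e - t)
                 for est_frame, tgt_frame in zip(est_mag, tgt_mag)
                 for e, t in zip(est_frame, tgt_frame)]
        total += sum(diffs) / len(diffs)
    return total / len(resolutions)


if __name__ == "__main__":
    # Toy signals: a sine wave and the same wave plus a higher-frequency tone.
    clean = [math.sin(2 * math.pi * 5 * n / 256) for n in range(256)]
    noisy = [c + 0.3 * math.sin(2 * math.pi * 40 * n / 256)
             for n, c in enumerate(clean)]
    print(multi_resolution_stft_loss(clean, clean))
    print(multi_resolution_stft_loss(noisy, clean))
```

Short frames localize events in time while long frames separate nearby frequencies, so summing over several resolutions penalizes errors that a single fixed window might miss. In training code this would normally be written with a differentiable FFT-based STFT rather than the explicit loops shown here.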