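Computer science
Mean opinion score
Experimental setup
Speech recognition
Speech enhancement
SIGNAL (programming language)
Speech processing
Ratio
Stage (stratigraphy)
Set (abstract data type)
Artificial intelligence
Engineering
Paleontology
Metric (unit)
Operations management
Physics
Quantum mechanics
Biology
Noise reduction
Programming language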
Authors
Weixin Zhu, Zilin Wang, Jiuxin Lin, Chang Zeng, Tao Yu
Identifier
DOI:10.1109/icassp49357.2023.10094845
Abstract
The ICASSP 2023 Speech Signal Improvement (SSI) Challenge focuses on improving the speech signal quality of real-time communication (RTC) systems. In this paper, we introduce the speech signal improvement network (SSI-Net), our submission to the ICASSP 2023 SSI Challenge, which satisfies the challenge's real-time condition. The proposed SSI-Net has a multi-stage architecture. In the first stage, restoration, we present the time-domain restoration generative adversarial network (TRGAN) for speech restoration. In the second stage, enhancement, we employ MTFAA-Lite, a lightweight version of the multi-scale temporal frequency convolutional network with axial self-attention (MTFAA-Net), to enhance the fullband speech. In the subjective test on the SSI Challenge blind test set, the proposed SSI-Net achieves a P.835 overall mean opinion score (MOS) of 3.190 and a P.804 overall MOS of 3.178, placing 3rd in Tracks 1 and 2.