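Noise (video)
Computer science
Estimator
Noise power
Noise measurement
Noise floor
Numerical noise
Spectral density
Gradient noise
Speech enhancement
Minimum mean-square error
Speech recognition
Artificial intelligence
Algorithm
Power (physics)
Mathematics
Noise reduction
Statistics
Telecommunications
Physics
Image (mathematics)
Quantum mechanics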
Authors
Qiquan Zhang,Aaron Nicolson,Mingjiang Wang,Kuldip K. Paliwal,C.P. Wang
Source
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
[Institute of Electrical and Electronics Engineers]
Date: 2020-01-01
Volume: 28, pages 1404-1415
Citations: 119
Identifier
DOI:10.1109/taslp.2020.2987441
Abstract
An accurate noise power spectral density (PSD) tracker is an indispensable component of a single-channel speech enhancement system. Bayesian-motivated minimum mean-square error (MMSE)-based noise PSD estimators have been the most prominent in recent times. However, they lack the ability to track highly non-stationary noise sources due to current methods of a priori signal-to-noise ratio (SNR) estimation. This is caused by the underlying assumption that the noise signal changes at a slower rate than the speech signal. As a result, MMSE-based noise PSD trackers exhibit a large tracking delay and produce noise PSD estimates that require bias compensation. Motivated by this, we propose an MMSE-based noise PSD tracker that employs a temporal convolutional network (TCN) a priori SNR estimator. The proposed noise PSD tracker, called DeepMMSE, makes no assumptions about the characteristics of the noise or the speech, exhibits no tracking delay, and produces an accurate estimate that requires no bias correction. Our extensive experimental investigation shows that the proposed DeepMMSE method outperforms state-of-the-art noise PSD trackers and demonstrates the ability to track abrupt changes in the noise level. Furthermore, when employed in a speech enhancement framework, the proposed DeepMMSE method is able to outperform state-of-the-art noise PSD trackers, as well as multiple deep learning approaches to speech enhancement. Availability: DeepMMSE is available at: https://github.com/anicolson/DeepXi.
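To illustrate how an a priori SNR estimate feeds an MMSE-based noise PSD tracker, the following is a minimal sketch for a single frequency bin. It is not the authors' implementation. It uses the textbook conditional expectation of the noise periodogram under complex-Gaussian speech and noise models, E[|D|^2 | X] = (1/(1+xi))^2 |X|^2 + (xi/(1+xi)) lambda_d, where xi is the a priori SNR. In DeepMMSE the a priori SNR would come from the TCN; here it is simply passed in. The function names, the recursive smoothing step, and the default `smoothing` value are assumptions made for illustration.

```python
def mmse_noise_periodogram(noisy_power, prior_snr, prev_noise_psd):
    """MMSE estimate of the noise periodogram for one time-frequency bin.

    noisy_power    -- |X|^2, periodogram of the noisy speech
    prior_snr      -- a priori SNR (xi), linear scale, must be >= 0
    prev_noise_psd -- current noise PSD estimate (lambda_d)
    """
    if prior_snr < 0:
        raise ValueError("a priori SNR must be non-negative")
    gain = 1.0 / (1.0 + prior_snr)  # equals 1 / (1 + xi)
    # (1 - gain) equals xi / (1 + xi)
    return gain ** 2 * noisy_power + (1.0 - gain) * prev_noise_psd


def track_noise_psd(noisy_powers, prior_snrs, initial_psd, smoothing=0.8):
    """Track the noise PSD of one frequency bin across frames.

    The first-order recursive smoothing and its default constant are
    illustrative assumptions, not values taken from the paper.
    """
    psd = initial_psd
    track = []
    for noisy_power, prior_snr in zip(noisy_powers, prior_snrs):
        periodogram = mmse_noise_periodogram(noisy_power, prior_snr, psd)
        psd = smoothing * psd + (1.0 - smoothing) * periodogram
        track.append(psd)
    return track
```

When the a priori SNR is 0 (noise only), the estimate reduces to the noisy periodogram itself. As the a priori SNR grows, the estimate leans on the previous noise PSD, which is why the quality of the a priori SNR estimator governs how quickly the tracker can follow changes in the noise.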