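Computer science
Speech recognition
Intelligibility (philosophy)
Word error rate
Autoregressive model
Residual
Kalman filter
Speech enhancement
Noise (video)
Noise reduction
Artificial intelligence
Algorithm
Mathematics
Statistics
Epistemology
Image (mathematics)
Philosophy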
Authors
Nasir Saleem, Jiechao Gao, Muhammad Irfan Khattak, Hafiz Tayyab Rauf, Seifedine Kadry, Muhammad Shafi
Identifier
DOI:10.1016/j.knosys.2021.107914
Abstract
With recent research developments, deep learning models have become powerful alternatives for speech enhancement and recognition in many real-world applications. Although state-of-the-art models achieve phenomenal results in background noise reduction, the challenge is to design robust models that improve quality, intelligibility, and word error rate. We propose a novel residual connection-based Bidirectional Gated Recurrent Unit (BiGRU) augmented Kalman filtering model for speech enhancement and recognition. In the proposed model, the clean speech and noise signals are modeled as autoregressive processes whose parameters consist of linear prediction coefficients (LPCs) and driving noise variances. Recurrent neural networks are trained to estimate the line spectrum frequencies (LSFs), whereas an optimization problem is solved to obtain the noise variances so as to minimize the divergence between the modeled and predicted autoregressive spectra of the noise-contaminated speech. Augmented Kalman filtering with the estimated parameters is applied to the noisy speech to reduce background noise and thereby improve speech quality, intelligibility, and word error rate. A bidirectional GRU network is implemented that predicts the parameters from both the future and past contexts of the input sequence and excels at modeling long-term dependencies. A compensated phase spectrum is used to recover the enhanced speech signals. The Kaldi toolkit is employed to train the automatic speech recognition (ASR) system in order to measure the word error rates (WERs). On the LibriSpeech dataset, the proposed model improved quality, intelligibility, and word error rate by 35.52%, 18.79%, and 19.13%, respectively, under various noisy environments.
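To make the filtering step in the abstract more concrete, the following is a minimal sketch of an augmented Kalman filter in which both the speech and the noise are autoregressive processes and the observation is their sum. It is not the authors' implementation: the function names (`companion`, `simulate_ar`, `augmented_kalman_filter`), the AR orders, and the coefficient values are all hypothetical, and the AR parameters are assumed to be known exactly, whereas the paper estimates them with the BiGRU network and an optimization step.

```python
import numpy as np


def companion(coeffs):
    """Companion matrix for x(n) = sum_k coeffs[k-1] * x(n-k) + driving noise."""
    p = len(coeffs)
    F = np.zeros((p, p))
    F[0, :] = coeffs                 # first row holds the AR coefficients
    F[1:, :-1] = np.eye(p - 1)       # shift older samples down by one slot
    return F


def simulate_ar(coeffs, var, n, rng):
    """Draw n samples of an AR process (illustrative synthetic data only)."""
    p = len(coeffs)
    x = np.zeros(n + p)
    for i in range(p, n + p):
        past = x[i - p:i][::-1]      # [x(i-1), x(i-2), ...]
        x[i] = np.dot(coeffs, past) + rng.normal(scale=np.sqrt(var))
    return x[p:]


def augmented_kalman_filter(y, a, var_w, b, var_u):
    """Estimate speech s(n) from y(n) = s(n) + v(n).

    a, var_w: AR coefficients and driving-noise variance of the speech.
    b, var_u: AR coefficients and driving-noise variance of the noise.
    The state stacks the recent speech samples and the recent noise samples.
    """
    p, q = len(a), len(b)
    F = np.zeros((p + q, p + q))
    F[:p, :p] = companion(a)
    F[p:, p:] = companion(b)
    Q = np.zeros((p + q, p + q))
    Q[0, 0] = var_w                  # driving noise enters the newest speech sample
    Q[p, p] = var_u                  # and the newest noise sample
    H = np.zeros(p + q)
    H[0] = 1.0                       # observation picks s(n) ...
    H[p] = 1.0                       # ... plus v(n)

    x = np.zeros(p + q)
    P = np.eye(p + q)
    s_hat = np.empty(len(y))
    for n, y_n in enumerate(y):
        # Prediction step
        x = F @ x
        P = F @ P @ F.T + Q
        # Update step (no separate measurement noise; tiny term avoids division by zero)
        S = H @ P @ H + 1e-12
        K = (P @ H) / S
        x = x + K * (y_n - H @ x)
        P = P - np.outer(K, H @ P)
        s_hat[n] = x[0]              # filtered estimate of the current speech sample
    return s_hat


# Synthetic demonstration with assumed (hypothetical) AR parameters.
rng = np.random.default_rng(0)
a, var_w = np.array([1.6, -0.8]), 1.0    # stand-in for speech LPCs
b, var_u = np.array([0.5]), 4.0          # stand-in for colored noise
s = simulate_ar(a, var_w, 2000, rng)
v = simulate_ar(b, var_u, 2000, rng)
y = s + v

s_hat = augmented_kalman_filter(y, a, var_w, b, var_u)
mse_noisy = np.mean((y - s) ** 2)
mse_enhanced = np.mean((s_hat - s) ** 2)
print(f"MSE noisy: {mse_noisy:.3f}, MSE enhanced: {mse_enhanced:.3f}")
```

In a real system the coefficients would change frame by frame and would come from the estimated LSFs and noise variances, so this fixed-parameter version only illustrates the state-space structure.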