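Autoencoder
Computer science
Task (project management)
Training (meteorology)
Speech recognition
Artificial intelligence
Pattern recognition (psychology)
Training set
Deep learning
Physics
Meteorology
Economics
Management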
Authors
Chao-Yuan KAO, Sangwook Park, Alzahra Badi, David K. Han, Hanseok Ko
Source
Journal: IEICE Transactions on Information and Systems
[Institute of Electronics, Information and Communications Engineers]
Date: 2020-04-30
Volume (issue), pages: E103.D (5): 1195-1198
Citations: 1
Identifiers
DOI:10.1587/transinf.2019edl8183
Abstract
Performance in Automatic Speech Recognition (ASR) degrades dramatically in noisy environments. To alleviate this problem, a variety of deep networks based on convolutional neural networks and recurrent neural networks were proposed by applying L1 or L2 loss. In this Letter, we propose a new orthogonal gradient penalty (OGP) method for Wasserstein Generative Adversarial Networks (WGAN) applied to denoising and despeeching models. WGAN integrates a multi-task autoencoder which estimates not only speech features but also noise features from noisy speech. While achieving 14.1% improvement in Wasserstein distance convergence rate, the proposed OGP enhanced features are tested in ASR and achieve 9.7%, 8.6%, 6.2%, and 4.8% WER improvements over DDAE, MTAE, R-CED(CNN) and RNN models.
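The abstract reports its ASR results as word error rate (WER). As a point of reference, the sketch below shows the standard WER definition: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration. This is not the authors' evaluation code, and the abstract does not state whether the reported improvements are relative or absolute.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenizes on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("the cat sat on the mat", "the cat sit on mat")` counts one substitution and one deletion against six reference words.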