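Computer science
Generative grammar
Adversarial system
Speech recognition
Speech processing
Generative adversarial network
Artificial intelligence
Natural language processing
Deep learning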
Authors
Aamir Wali, Zareen Alamgir, Saira Karim, Ather Fawaz, Mubariz Barkat Ali, Muhammad Adan, Malik Mujtaba
Identifier
DOI:10.1016/j.csl.2021.101308
Abstract
Generative adversarial networks (GANs) have seen remarkable progress in recent years. They are used as generative models for all kinds of data such as text, images, audio, music, videos, and animations. This paper presents a comprehensive review of the novel and emerging GAN-based speech frameworks and algorithms that have revolutionized speech processing. We have categorized speech GANs based on application areas: speech synthesis, speech enhancement & conversion, and data augmentation in automatic speech recognition and emotion speech recognition systems. This review also includes a summary of the data sets and evaluation metrics commonly used in speech GANs. We also suggest some interesting research directions for future work and highlight the issues faced by current state-of-the-art speech GANs.