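Hidden Markov model
Speech recognition
Computer science
Mixture model
Artificial neural network
Margin (machine learning)
Deep neural network
Pattern recognition (psychology)
Frame (networking)
Artificial intelligence
Acoustic model
Gaussian distribution
Speech processing
Machine learning
Telecommunications
Physics
Quantum mechanics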
Authors
Geoffrey E. Hinton, Li Deng, Dong Yu, George E. Dahl, Abdelrahman Mohamed, Navdeep Jaitly, Andrew W. Senior, Vincent Vanhoucke, Patrick Nguyen, Tara N. Sainath, Brian Kingsbury
Source
Journal: IEEE Signal Processing Magazine
[Institute of Electrical and Electronics Engineers]
Date: 2012-11-01
Volume (issue): 29 (6): 82-97
Citations: 8401
Identifier
DOI: 10.1109/msp.2012.2205597
Abstract
Most current speech recognition systems use hidden Markov models (HMMs) to deal with the temporal variability of speech and Gaussian mixture models (GMMs) to determine how well each state of each HMM fits a frame or a short window of frames of coefficients that represents the acoustic input. An alternative way to evaluate the fit is to use a feed-forward neural network that takes several frames of coefficients as input and produces posterior probabilities over HMM states as output. Deep neural networks (DNNs) that have many hidden layers and are trained using new methods have been shown to outperform GMMs on a variety of speech recognition benchmarks, sometimes by a large margin. This article provides an overview of this progress and represents the shared views of four research groups that have had recent successes in using DNNs for acoustic modeling in speech recognition.
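The alternative described in the abstract, a feed-forward network that takes several frames of coefficients as input and outputs posterior probabilities over HMM states, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the weights are random and untrained, the sizes (`N_COEFFS`, `CONTEXT`, `N_HIDDEN`, `N_STATES`) are arbitrary, the sigmoid hidden units and the edge-clamping in `splice_frames` are assumptions, and all function names are hypothetical.

```python
import math
import random


def splice_frames(frames, t, context):
    """Concatenate frames t-context..t+context, repeating the edge frames."""
    last = len(frames) - 1
    window = []
    for k in range(t - context, t + context + 1):
        window.extend(frames[min(max(k, 0), last)])
    return window


def affine(weights, biases, x):
    """Compute W x + b for one layer (weights holds one row per output unit)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def sigmoid(v):
    return [1.0 / (1.0 + math.exp(-z)) for z in v]


def softmax(v):
    """Numerically stable softmax; the outputs are positive and sum to 1."""
    m = max(v)
    exps = [math.exp(z - m) for z in v]
    total = sum(exps)
    return [e / total for e in exps]


def random_layer(rng, n_in, n_out):
    """Untrained layer with small random weights (assumption: illustration only)."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def state_posteriors(layers, x):
    """Sigmoid hidden layers followed by a softmax over HMM states."""
    for weights, biases in layers[:-1]:
        x = sigmoid(affine(weights, biases, x))
    weights, biases = layers[-1]
    return softmax(affine(weights, biases, x))


rng = random.Random(0)
# Illustrative sizes only; none of these values come from the article.
N_COEFFS, CONTEXT, N_HIDDEN, N_STATES = 13, 5, 32, 6
frames = [[rng.gauss(0.0, 1.0) for _ in range(N_COEFFS)] for _ in range(20)]
input_dim = N_COEFFS * (2 * CONTEXT + 1)
layers = [
    random_layer(rng, input_dim, N_HIDDEN),
    random_layer(rng, N_HIDDEN, N_HIDDEN),
    random_layer(rng, N_HIDDEN, N_STATES),
]
posteriors = state_posteriors(layers, splice_frames(frames, 0, CONTEXT))
```

Here `posteriors` holds one probability per HMM state for frame 0. A real system would train the weights and use far larger layers and state inventories.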