Topics
Computer science
Complementarity (molecular biology)
Convolutional neural network
Artificial intelligence
Deep neural network
Diversity (cybernetics)
Vocabulary
Speech recognition
Term (time)
Artificial neural network
Pattern recognition (psychology)
Machine learning
Linguistics
Philosophy
Genetics
Physics
Quantum mechanics
Biology
Authors
Tara N. Sainath, Oriol Vinyals, Andrew Senior, Haşim Sak
Source
Venue: International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Date: 2015-04-01
Citations: 1472
Identifier
DOI:10.1109/icassp.2015.7178838
Abstract
Both Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) have shown improvements over Deep Neural Networks (DNNs) across a wide variety of speech recognition tasks. CNNs, LSTMs and DNNs are complementary in their modeling capabilities, as CNNs are good at reducing frequency variations, LSTMs are good at temporal modeling, and DNNs are appropriate for mapping features to a more separable space. In this paper, we take advantage of the complementarity of CNNs, LSTMs and DNNs by combining them into one unified architecture. We explore the proposed architecture, which we call CLDNN, on a variety of large vocabulary tasks, varying from 200 to 2,000 hours. We find that the CLDNN provides a 4-6% relative improvement in WER over an LSTM, the strongest of the three individual models.
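The abstract describes the CLDNN as a stack of convolutional layers (reducing frequency variation), LSTM layers (temporal modeling), and fully connected layers (mapping features to a more separable space). The sketch below is a minimal PyTorch rendering of that layering; the layer sizes, kernel shapes, input features (40 log-mel bins), and output dimension are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal CLDNN-style acoustic model sketch in PyTorch.
# All dimensions below are hypothetical, chosen only to show the
# CNN -> LSTM -> DNN layering described in the abstract.
import torch
import torch.nn as nn

class CLDNN(nn.Module):
    def __init__(self, n_mels=40, n_classes=42):
        super().__init__()
        # CNN front end: convolve along the frequency axis to reduce
        # spectral variation (input treated as 1-channel time-frequency map).
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=(1, 9), padding=(0, 4)),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 2)),  # pool in frequency only
        )
        conv_out = 32 * (n_mels // 2)
        # Linear projection to shrink the CNN output before the LSTM.
        self.proj = nn.Linear(conv_out, 256)
        # Stacked LSTM for temporal modeling of the frame sequence.
        self.lstm = nn.LSTM(256, 512, num_layers=2, batch_first=True)
        # DNN on top: map LSTM outputs to a more separable space,
        # then to per-frame output classes.
        self.dnn = nn.Sequential(
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, n_classes),
        )

    def forward(self, x):  # x: (batch, time, n_mels)
        b, t, _ = x.shape
        h = self.conv(x.unsqueeze(1))                 # (b, 32, t, n_mels//2)
        h = h.permute(0, 2, 1, 3).reshape(b, t, -1)   # (b, t, conv_out)
        h = self.proj(h)
        h, _ = self.lstm(h)
        return self.dnn(h)                            # per-frame logits

frames = torch.randn(8, 100, 40)   # 8 utterances, 100 frames, 40 mel bins
logits = CLDNN()(frames)           # -> shape (8, 100, 42)
```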