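Computer science
Convolutional neural network
Speech recognition
Feature extraction
Artificial intelligence
Feature (linguistics)
Pattern recognition (psychology)
Deep learning
Facial recognition system
Speaker recognition
Face (sociological concept)
Social science
Linguistics
Philosophy
Sociology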
Authors
Jinghan Wu,Tao Zhao,Yakun Zhang,Liang Xie,Yan Ye,Erwei Yin
Identifier
DOI:10.1109/embc46164.2021.9630373
Abstract
To provide an external human-machine interaction platform for elderly people in need, a novel silent speech recognition system based on facial surface electromyography was developed. In this study, we propose a deep learning architecture named Parallel-Inception Convolutional Neural Network (PICNN) and employ an up-to-date feature extraction method, log Mel frequency spectral coefficients (MFSC). To better meet the requirements of our target users, a 100-class dataset of demands related to daily life was designed and generated for the comparative experiments. According to the experimental results, the proposed recognition framework based on MFSC and PICNN achieved the highest recognition accuracy, 88.44%, exceeding the performance of state-of-the-art deep learning algorithms such as CNN, VGGNet and Inception CNN by 3.22%, 4.09% and 1.19%, respectively. These findings suggest that the newly developed silent speech approach holds promise for providing a more reliable communication channel, while also expanding the application scenarios of speech recognition technology.
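As a rough illustration of the feature extraction step named in the abstract, the sketch below computes log Mel frequency spectral coefficients for a single signal channel using NumPy. It is not the authors' implementation: the function names (`hz_to_mel`, `mel_to_hz`, `mel_filterbank`, `mfsc`) and every parameter value (1000 Hz sampling rate, 200-sample frames, 50-sample hop, 256-point FFT, 20 filters) are assumptions chosen only to make the example runnable.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style mel mapping.
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are equally spaced on the mel scale.
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0),
                             n_filters + 2)
    hz_points = mel_to_hz(mel_points)
    freqs = np.linspace(0.0, sample_rate / 2.0, n_fft // 2 + 1)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        left, center, right = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        fbank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fbank

def mfsc(signal, sample_rate=1000, frame_len=200, hop=50,
         n_fft=256, n_filters=20):
    # All default values here are illustrative assumptions, not from the paper.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)           # windowed frames
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    return np.log(mel_energy + 1e-10)                      # log mel energies
```

Calling `mfsc(x)` on a one-dimensional array `x` returns a matrix with one row per frame and one column per mel filter. Unlike MFCC, no discrete cosine transform is applied afterwards, so neighbouring coefficients stay locally correlated, which suits convolutional models that treat the matrix as an image-like input.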