Topics
Computer science
Sentence
Decoding methods
Electroencephalography (EEG)
Speech recognition
Identification (biology)
Identity (music)
Artificial neural network
Artificial intelligence
Subject (documents)
Natural language processing
Psychology
Neuroscience
Plant
Biology
Telecommunications
Physics
Library science
Acoustics
Authors
Carlos Hernández García del Valle, Carolina Méndez‐Orellana, Christian Herff, María Rodríguez-Fernández
Source
Journal: Journal of Neural Engineering
[IOP Publishing]
Date: 2024-10-18
Identifier
DOI: 10.1088/1741-2552/ad88a3
Abstract
Objective. Decoding speech from brain activity can enable communication for individuals with speech disorders. Deep neural networks have shown great potential for speech decoding applications. However, the limited availability of large datasets containing neural recordings from speech-impaired subjects poses a challenge. Leveraging data from healthy participants can mitigate this limitation and expedite the development of speech neuroprostheses while minimizing the need for patient-specific training data.
Approach. In this study, we collected a substantial dataset consisting of recordings from 56 healthy participants using 64 EEG channels. Multiple neural networks were trained to classify perceived sentences in the Spanish language using subject-independent, mixed-subjects, and fine-tuning approaches. The dataset has been made publicly available to foster further research in this area.
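The three training regimes described above differ only in how trials are partitioned between training and test sets. The sketch below illustrates the partitioning logic with NumPy; the function names and the toy trial layout are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def subject_independent_split(subject_ids, test_subjects):
    """Leave whole subjects out: train on all trials from the remaining subjects."""
    subject_ids = np.asarray(subject_ids)
    test_mask = np.isin(subject_ids, test_subjects)
    return np.where(~test_mask)[0], np.where(test_mask)[0]

def mixed_subjects_split(n_trials, test_fraction=0.2, seed=0):
    """Pool trials from all subjects and hold out a random fraction of trials."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_trials)
    n_test = int(n_trials * test_fraction)
    return order[n_test:], order[:n_test]

# Toy layout (hypothetical numbers): 4 subjects x 60 trials each = 240 trials
subject_ids = np.repeat(np.arange(4), 60)

# Subject-independent: hold out all trials of subject 3
train_idx, test_idx = subject_independent_split(subject_ids, test_subjects=[3])
print(len(train_idx), len(test_idx))  # 180 60
```

Fine-tuning combines the two: a model pretrained on the subject-independent training set is further trained on a held-out portion of the target subject's own trials before evaluation.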
Main results. Our results demonstrate a remarkable level of accuracy in distinguishing sentence identity across 30 classes, showcasing the feasibility of training deep neural networks (DNNs) to decode sentence identity from perceived speech using EEG. Notably, the subject-independent approach achieved accuracy comparable to the mixed-subjects approach, although with higher variability among subjects. Additionally, our fine-tuning approach yielded even higher accuracy, indicating an improved capability to adapt to individual subject characteristics, which enhances performance. This suggests that DNNs have effectively learned to decode universal features of brain activity across individuals while also being adaptable to specific participant data. Furthermore, our analyses indicate that EEGNet and DeepConvNet exhibit comparable performance, outperforming ShallowConvNet for sentence identity decoding. Finally, our Grad-CAM visualization analysis identifies key areas influencing the network's predictions, offering valuable insights into the neural processes underlying language perception and comprehension.
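At its core, Grad-CAM weights each feature map of a convolutional layer by the gradient of the class score with respect to that map, then keeps the positive part of the weighted sum. A minimal NumPy sketch of that computation follows, using synthetic activations and gradients over a hypothetical (EEG channels x time) grid; in practice both tensors would come from a trained network via backpropagation.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Minimal Grad-CAM sketch.

    activations: (K, H, W) feature maps from a conv layer
    gradients:   (K, H, W) gradients of the class score w.r.t. those maps
    """
    weights = gradients.mean(axis=(1, 2))             # alpha_k: pooled gradient per map
    cam = np.tensordot(weights, activations, axes=1)  # sum_k alpha_k * A_k
    cam = np.maximum(cam, 0.0)                        # ReLU keeps positive evidence only
    if cam.max() > 0:
        cam /= cam.max()                              # normalise to [0, 1]
    return cam

# Synthetic example: 8 feature maps over a 64-channel x 32-sample grid
rng = np.random.default_rng(42)
acts = rng.random((8, 64, 32))
grads = rng.standard_normal((8, 64, 32))
heatmap = grad_cam(acts, grads)
print(heatmap.shape)  # (64, 32)
```

The resulting heatmap highlights which channel/time regions push the class score up, which is how such maps can point to the neural sources driving the network's predictions.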
Significance. These findings advance our understanding of EEG-based speech perception decoding and hold promise for the development of speech neuroprostheses, particularly in scenarios where subjects cannot provide their own training data.