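Computer science
Gesture
Speech recognition
Avatar
Modalities
Neuroprosthetics
Decoding methods
Psychology
Human-computer interaction
Artificial intelligence
Neuroscience
Social science
Telecommunications
Sociology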
Authors
Sean L. Metzger, Kaylo T. Littlejohn, Alexander B. Silva, David A. Moses, Margaret P. Seaton, Ran Wang, Maximilian E. Dougherty, Jessie R. Liu, Peter Wu, Michael A. Berger, Inga Zhuravleva, Adelyn Tu-Chan, Karunesh Ganguly, Gopala K. Anumanchipalli, Edward F. Chang
Source
Journal: Nature
[Springer Nature]
Date: 2023-08-23
Volume (issue): 620 (7976): 1037-1046
Citations: 132
Identifier
DOI:10.1038/s41586-023-06443-4
Abstract
Speech neuroprostheses have the potential to restore communication to people living with paralysis, but naturalistic speed and expressivity are elusive [1]. Here we use high-density surface recordings of the speech cortex in a clinical-trial participant with severe limb and vocal paralysis to achieve high-performance real-time decoding across three complementary speech-related output modalities: text, speech audio and facial-avatar animation. We trained and evaluated deep-learning models using neural data collected as the participant attempted to silently speak sentences. For text, we demonstrate accurate and rapid large-vocabulary decoding with a median rate of 78 words per minute and median word error rate of 25%. For speech audio, we demonstrate intelligible and rapid speech synthesis and personalization to the participant’s pre-injury voice. For facial-avatar animation, we demonstrate the control of virtual orofacial movements for speech and non-speech communicative gestures. The decoders reached high performance with less than two weeks of training. Our findings introduce a multimodal speech-neuroprosthetic approach that has substantial promise to restore full, embodied communication to people living with severe paralysis.
A study using high-density surface recordings of the speech cortex in a person with limb and vocal paralysis demonstrates real-time decoding of brain activity into text, speech sounds and orofacial movements.