A high-performance neuroprosthesis for speech decoding and avatar control
- PMID: 37612505
- PMCID: PMC10826467
- DOI: 10.1038/s41586-023-06443-4
Abstract
Speech neuroprostheses have the potential to restore communication to people living with paralysis, but naturalistic speed and expressivity are elusive [1]. Here we use high-density surface recordings of the speech cortex in a clinical-trial participant with severe limb and vocal paralysis to achieve high-performance real-time decoding across three complementary speech-related output modalities: text, speech audio and facial-avatar animation. We trained and evaluated deep-learning models using neural data collected as the participant attempted to silently speak sentences. For text, we demonstrate accurate and rapid large-vocabulary decoding with a median rate of 78 words per minute and median word error rate of 25%. For speech audio, we demonstrate intelligible and rapid speech synthesis and personalization to the participant's pre-injury voice. For facial-avatar animation, we demonstrate the control of virtual orofacial movements for speech and non-speech communicative gestures. The decoders reached high performance with less than two weeks of training. Our findings introduce a multimodal speech-neuroprosthetic approach that has substantial promise to restore full, embodied communication to people living with severe paralysis.
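The two text-decoding figures quoted in the abstract, word error rate and words per minute, can be illustrated with a minimal sketch. It assumes the standard definition of word error rate (word-level edit distance divided by the number of reference words) and counts decoded words per minute of trial time; the abstract does not spell out the study's exact metric definitions, and the function names and the three trials below are made up for illustration, not data from the paper.

```python
from statistics import median


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def words_per_minute(hypothesis: str, duration_seconds: float) -> float:
    """Decoded words divided by trial duration in minutes (assumed definition)."""
    return len(hypothesis.split()) / (duration_seconds / 60.0)


# Hypothetical (reference, decoded, duration in seconds) trials.
trials = [
    ("how are you today", "how are you today", 3.0),
    ("i would like some water", "i would like water", 4.0),
    ("please open the window", "please open a window", 3.0),
]

# Summarize across trials with the median, as the abstract does.
median_wer = median(word_error_rate(r, h) for r, h, _ in trials)
median_wpm = median(words_per_minute(h, s) for _, h, s in trials)
print(f"median WER: {median_wer:.2f}, median WPM: {median_wpm:.1f}")
```

Reporting medians rather than means keeps a single badly decoded sentence from dominating the summary.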
© 2023. The Author(s), under exclusive licence to Springer Nature Limited.
Comment in
- Restoring speech. Nat Rev Neurosci. 2023 Nov;24(11):653. doi: 10.1038/s41583-023-00746-1. PMID: 37740095. No abstract available.
Cited by
- A bilingual speech neuroprosthesis driven by cortical articulatory representations shared between languages. Nat Biomed Eng. 2024 May 20. doi: 10.1038/s41551-024-01207-5. Online ahead of print. PMID: 38769157.
- Fast, accurate, and interpretable decoding of electrocorticographic signals using dynamic mode decomposition. Commun Biol. 2024 May 18;7(1):595. doi: 10.1038/s42003-024-06294-3. PMID: 38762683. Free PMC article.
- Brain-machine-interface device translates internal speech into text. Nat Hum Behav. 2024 Jun;8(6):1014-1015. doi: 10.1038/s41562-024-01869-w. PMID: 38740991. No abstract available.
- Representation of internal speech by single neurons in human supramarginal gyrus. Nat Hum Behav. 2024 Jun;8(6):1136-1149. doi: 10.1038/s41562-024-01867-y. Epub 2024 May 13. PMID: 38740984. Free PMC article.
- A flexible intracortical brain-computer interface for typing using finger movements. bioRxiv [Preprint]. 2024 Apr 26:2024.04.22.590630. doi: 10.1101/2024.04.22.590630. PMID: 38712189. Free PMC article. Preprint.
References
- Beukelman DR et al. Augmentative and Alternative Communication (Paul H. Brookes, 1998).
- Graves A, Fernández S, Gomez F & Schmidhuber J. Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks. In Proc. 23rd International Conference on Machine learning - ICML ’06 (eds Cohen W & Moore A) 369–376 (ACM Press, 2006). doi: 10.1145/1143844.1143891.