A high-performance neuroprosthesis for speech decoding and avatar control

SL Metzger, KT Littlejohn, AB Silva, DA Moses… - Nature, 2023 - nature.com
Speech neuroprostheses have the potential to restore communication to people living with
paralysis, but naturalistic speed and expressivity are elusive. Here we use high-density …

The speech neuroprosthesis

AB Silva, KT Littlejohn, JR Liu, DA Moses… - Nature Reviews …, 2024 - nature.com
Loss of speech after paralysis is devastating, but circumventing motor-pathway injury by
directly decoding speech from intact cortical activity has the potential to restore natural …

Deep Speech Synthesis from MRI-Based Articulatory Representations

P Wu, T Li, Y Lu, Y Zhang, J Lian, AW Black… - arXiv preprint arXiv …, 2023 - arxiv.org
In this paper, we study articulatory synthesis, a speech synthesis method using human vocal
tract information that offers a way to develop efficient, generalizable and interpretable …

Neural latent aligner: cross-trial alignment for learning representations of complex, naturalistic neural data

CJ Cho, E Chang… - … Conference on Machine …, 2023 - proceedings.mlr.press
Understanding the neural implementation of complex human behaviors is one of the major
goals in neuroscience. To this end, it is crucial to find a true representation of the neural …

SLIM: Style-linguistics mismatch model for generalized audio deepfake detection

Y Zhu, S Koppisetti, T Tran, G Bharaj - arXiv preprint arXiv:2407.18517, 2024 - arxiv.org
Audio deepfake detection (ADD) is crucial to combat the misuse of speech synthesized from
generative AI models. Existing ADD models suffer from generalization issues, with a large …

Improving speech inversion through self-supervised embeddings and enhanced tract variables

AA Attia, YM Siriwardena… - 2024 32nd European …, 2024 - ieeexplore.ieee.org
The performance of deep learning models depends significantly on their capacity to encode
input features efficiently and decode them into meaningful outputs. Better input and output …

SD-HuBERT: Sentence-Level Self-Distillation Induces Syllabic Organization in HuBERT

CJ Cho, A Mohamed, SW Li, AW Black… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Data-driven unit discovery in self-supervised learning (SSL) of speech has embarked on a
new era of spoken language processing. Yet, the discovered units often remain in phonetic …

Multimodal segmentation for vocal tract modeling

R Jain, B Yu, P Wu, T Prabhune… - arXiv preprint arXiv …, 2024 - arxiv.org
Accurate modeling of the vocal tract is necessary to construct articulatory representations for
interpretable speech processing and linguistics. However, vocal tract modeling is …

SD-HuBERT: Self-Distillation Induces Syllabic Organization in HuBERT

CJ Cho, A Mohamed, SW Li, AW Black… - arXiv preprint arXiv …, 2023 - arxiv.org
Data-driven unit discovery in self-supervised learning (SSL) of speech has embarked on a
new era of spoken language processing. Yet, the discovered units often remain in phonetic …

Self-Supervised Models of Speech Infer Universal Articulatory Kinematics

CJ Cho, A Mohamed, AW Black… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Self-Supervised Learning (SSL) based models of speech have shown remarkable
performance on a range of downstream tasks. These state-of-the-art models have remained …