A high-performance neuroprosthesis for speech decoding and avatar control
Speech neuroprostheses have the potential to restore communication to people living with
paralysis, but naturalistic speed and expressivity are elusive. Here we use high-density …
The speech neuroprosthesis
Loss of speech after paralysis is devastating, but circumventing motor-pathway injury by
directly decoding speech from intact cortical activity has the potential to restore natural …
Deep Speech Synthesis from MRI-Based Articulatory Representations
In this paper, we study articulatory synthesis, a speech synthesis method using human vocal
tract information that offers a way to develop efficient, generalizable and interpretable …
Neural latent aligner: cross-trial alignment for learning representations of complex, naturalistic neural data
Understanding the neural implementation of complex human behaviors is one of the major
goals in neuroscience. To this end, it is crucial to find a true representation of the neural …
Slim: Style-linguistics mismatch model for generalized audio deepfake detection
Audio deepfake detection (ADD) is crucial to combat the misuse of speech synthesized from
generative AI models. Existing ADD models suffer from generalization issues, with a large …
Improving speech inversion through self-supervised embeddings and enhanced tract variables
The performance of deep learning models depends significantly on their capacity to encode
input features efficiently and decode them into meaningful outputs. Better input and output …
SD-HuBERT: Sentence-Level Self-Distillation Induces Syllabic Organization in HuBERT
Data-driven unit discovery in self-supervised learning (SSL) of speech has embarked on a
new era of spoken language processing. Yet, the discovered units often remain in phonetic …
Multimodal segmentation for vocal tract modeling
Accurate modeling of the vocal tract is necessary to construct articulatory representations for
interpretable speech processing and linguistics. However, vocal tract modeling is …
SD-HuBERT: Self-Distillation Induces Syllabic Organization in HuBERT
Data-driven unit discovery in self-supervised learning (SSL) of speech has embarked on a
new era of spoken language processing. Yet, the discovered units often remain in phonetic …
Self-Supervised Models of Speech Infer Universal Articulatory Kinematics
Self-Supervised Learning (SSL) based models of speech have shown remarkable
performance on a range of downstream tasks. These state-of-the-art models have remained …