Speech synthesis from neural decoding of spoken sentences

GK Anumanchipalli, J Chartier, EF Chang - Nature, 2019 - nature.com
Technology that translates neural activity into speech would be transformative for people
who are unable to communicate as a result of neurological impairments. Decoding speech …

A review of data collection practices using electromagnetic articulography

T Rebernik, J Jacobi, R Jonkers, A Noiray… - Laboratory …, 2021 - research.rug.nl
This paper reviews data collection practices in electromagnetic articulography (EMA)
studies, with a focus on sensor placement. It consists of three parts: in the first part, we …

Statistics in phonetics

S Tavakoli, B Matteo, D Pigoli, E Chodroff… - Annual Review of …, 2024 - annualreviews.org
Phonetics is the scientific field concerned with the study of how speech is produced, heard,
and perceived. It abounds with data, such as acoustic speech recordings, neuroimaging …

Encoding of articulatory kinematic trajectories in human speech sensorimotor cortex

J Chartier, GK Anumanchipalli, K Johnson, EF Chang - Neuron, 2018 - cell.com
When speaking, we dynamically coordinate movements of our jaw, tongue, lips, and larynx.
To investigate the neural mechanisms underlying articulation, we used direct cortical …

u-HuBERT: Unified mixed-modal speech pretraining and zero-shot transfer to unlabeled modality

WN Hsu, B Shi - Advances in Neural Information Processing …, 2022 - proceedings.neurips.cc
While audio-visual speech models can yield superior performance and robustness
compared to audio-only models, their development and adoption are hindered by the lack of …

The secret source: Incorporating source features to improve acoustic-to-articulatory speech inversion

YM Siriwardena, C Espy-Wilson - ICASSP 2023 - 2023 IEEE …, 2023 - ieeexplore.ieee.org
In this work, we incorporated acoustically derived source features, aperiodicity, periodicity
and pitch as additional targets to an acoustic-to-articulatory speech inversion (SI) system …

EMG-to-speech: Direct generation of speech from facial electromyographic signals

M Janke, L Diener - IEEE/ACM Transactions on Audio, Speech …, 2017 - ieeexplore.ieee.org
Silent speech interfaces are systems that enable speech communication even when an
acoustic signal is unavailable. Over the last years, public interest in such interfaces has …

Self-supervised ASR models and features for dysarthric and elderly speech recognition

S Hu, X **e, M Geng, Z **, J Deng, G Li… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
Self-supervised learning (SSL) based speech foundation models have been applied to a
wide range of ASR tasks. However, their application to dysarthric and elderly speech via …

Evidence of vocal tract articulation in self-supervised learning of speech

CJ Cho, P Wu, A Mohamed… - ICASSP 2023 - 2023 …, 2023 - ieeexplore.ieee.org
Recent self-supervised learning (SSL) models have proven to learn rich representations of
speech, which can readily be utilized by diverse downstream tasks. To understand such …

Deep speech synthesis from articulatory representations

P Wu, S Watanabe, L Goldstein, AW Black… - arXiv preprint arXiv …, 2022 - arxiv.org
In the articulatory synthesis task, speech is synthesized from input features containing
information about the physical behavior of the human vocal tract. This task provides a …