Self-supervised speech representation learning: A review

A Mohamed, H Lee, L Borgholt… - IEEE Journal of …, 2022 - ieeexplore.ieee.org
Although supervised deep learning has revolutionized speech and audio processing, it has
necessitated the building of specialist models for individual tasks and application scenarios …

[BOOK] Voice quality: The laryngeal articulator model

JH Esling, SR Moisik, A Benner, L Crevier-Buchman - 2019 - books.google.com
"The first description of voice quality production in 40 years, this book provides a new
framework for its study: The Laryngeal Articulator Model. Informed by instrumental …

[PDF] SMASH: a tool for articulatory data processing and analysis

JR Green, J Wang, DL Wilson - Interspeech, 2013 - researchgate.net
Recent innovations in 3D motion capture technology such as electromagnetic articulography
(EMA) are providing unprecedented access to the intricate movements of the articulators …

Multi-view CCA-based acoustic features for phonetic recognition across speakers and domains

R Arora, K Livescu - 2013 IEEE International Conference on …, 2013 - ieeexplore.ieee.org
Canonical correlation analysis (CCA) and kernel CCA can be used for unsupervised
learning of acoustic features when a second view (e.g., articulatory measurements) is …

Multimodal dataset of real-time 2D and static 3D MRI of healthy French speakers

K Isaieva, Y Laprie, J Leclère, IK Douros, J Felblinger… - Scientific Data, 2021 - nature.com
The study of articulatory gestures has a wide spectrum of applications, notably in speech
production and recognition. Sets of phonemes, as well as their articulation, are language …

Paralinguistic mechanisms of production in human “beatboxing”: A real-time magnetic resonance imaging study

M Proctor, E Bresch, D Byrd, K Nayak… - The Journal of the …, 2013 - pubs.aip.org
Real-time Magnetic Resonance Imaging (rtMRI) was used to examine mechanisms of sound
production by an American male beatbox artist. rtMRI was found to be a useful modality with …

A study of laryngeal gestures in Mandarin citation tones using simultaneous laryngoscopy and laryngeal ultrasound (SLLUS)

SR Moisik, H Lin, JH Esling - Journal of the International Phonetic …, 2014 - cambridge.org
In this work, Mandarin tone production is examined using simultaneous laryngoscopy and
laryngeal ultrasound (SLLUS). Laryngoscopy is used to obtain information about laryngeal …

The epilarynx in speech

S Moisik - 2013 - dspace.library.uvic.ca
This dissertation examines the phonetic and phonological functioning of the supraglottal part
of the larynx, the epilarynx, from an articulatory-physiological perspective. The central thesis …

Interspeaker variability in hard palate morphology and vowel production

Purpose: Differences in vocal tract morphology have the potential to explain interspeaker
variability in speech production. The potential acoustic impact of hard palate shape was …

Data driven articulatory synthesis with deep neural networks

S Aryal, R Gutierrez-Osuna - Computer Speech & Language, 2016 - Elsevier
The conventional approach for data-driven articulatory synthesis consists of modeling the
joint acoustic-articulatory distribution with a Gaussian mixture model (GMM), followed by a …