Unsupervised acoustic unit discovery for speech synthesis using discrete latent-variable neural networks
For our submission to the ZeroSpeech 2019 challenge, we apply discrete latent-variable
neural networks to unlabelled speech and use the discovered units for speech synthesis …
Early phonetic learning without phonetic categories: Insights from large-scale simulations on realistic input
Before they even speak, infants become attuned to the sounds of the language(s) they hear,
processing native phonetic contrasts more easily than nonnative ones. For example …
Do self-supervised speech models develop human-like perception biases?
Self-supervised models for speech processing form representational spaces without using
any external labels. Increasingly, they appear to be a feasible way of at least partially …
How familiar does that sound? Cross-lingual representational similarity analysis of acoustic word embeddings
How do neural networks "perceive" speech sounds from unknown languages? Does the
typological similarity between the model's training language (L1) and an unknown language …
Automatic speech recognition in taxi call service systems
In this research, the application of an automatic speech recognition system in taxi call services
is investigated. In comparison with traditional query handling systems such as live agents …
Rediscovering the Slavic continuum in representations emerging from neural models of spoken language identification
BM Abdullah, J Kudera, T Avgustinova… - arXiv preprint arXiv …, 2020 - arxiv.org
Deep neural networks have been employed for various spoken language recognition tasks,
including tasks that are multilingual by definition such as spoken language identification. In …
Comparing unsupervised speech learning directly to human performance in speech perception
We compare the performance of humans (English and French listeners) versus an
unsupervised speech model in a perception experiment (ABX discrimination task). Although …