Massively multilingual adversarial speech recognition

O Adams, M Wiesner, S Watanabe… - arXiv preprint arXiv …, 2019 - arxiv.org
We report on adaptation of multilingual end-to-end speech recognition models trained on as
many as 100 languages. Our findings shed light on the relative importance of similarity …

Low Resource ASR: The Surprising Effectiveness of High Resource Transliteration.

S Khare, AR Mittal, A Diwan, S Sarawagi, P Jyothi… - Interspeech, 2021 - isca-archive.org
Cross-lingual transfer of knowledge from high-resource languages to low-resource
languages is an important research problem in automatic speech recognition (ASR). We …

Effectiveness of mining audio and text pairs from public data for improving ASR systems for low-resource languages

K Bhogale, A Raman, T Javed… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Collecting labelled datasets for speech recognition systems for low-resource languages on
a diverse set of domains and speakers is expensive. In this work, we demonstrate an …

Automatic speech recognition for supporting endangered language documentation

E Prud'hommeaux, R Jimerson, R Hatcher… - 2021 - scholarspace.manoa.hawaii.edu
Generating accurate word-level transcripts of recorded speech for language documentation
is difficult and time-consuming, even for skilled speakers of the target language. Automatic …

Crossing language identification: Multilingual ASR framework based on semantic dataset creation & Wav2Vec 2.0

OH Anidjar, R Yozevitch, N Bigon, N Abdalla… - Machine Learning with …, 2023 - Elsevier
This study proposes an innovative methodology to enhance the performance of multilingual
Automatic Speech Recognition (ASR) systems by capitalizing on the high semantic similarity …

Cross-lingual adaptation of a CTC-based multilingual acoustic model

S Tong, PN Garner, H Bourlard - Speech Communication, 2018 - Elsevier
Multilingual models for Automatic Speech Recognition (ASR) are attractive as they
have been shown to benefit from more training data, and better lend themselves to …

Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the “Speaking Rosetta” JSALT 2017 workshop

O Scharenborg, L Besacier, A Black… - … , Speech and Signal …, 2018 - ieeexplore.ieee.org
We summarize the accomplishments of a multi-disciplinary workshop exploring the
computational and scientific issues surrounding the discovery of linguistic units (subwords …

Development of speech recognition systems in emergency call centers

A Valizada, N Akhundova, S Rustamov - Symmetry, 2021 - mdpi.com
In this paper, various methodologies of acoustic and language models, as well as labeling
methods for automatic speech recognition for spoken dialogues in emergency call centers …

User-friendly automatic transcription of low-resource languages: Plugging ESPnet into Elpis

O Adams, B Galliot, G Wisniewski… - arXiv preprint arXiv …, 2020 - arxiv.org
This paper reports on progress integrating the speech recognition toolkit ESPnet into Elpis, a
web front-end originally designed to provide access to the Kaldi automatic speech …

Image2speech: Automatically generating audio descriptions of images

M Hasegawa-Johnson, A Black, L Ondel… - Casablanca …, 2017 - researchgate.net
This paper proposes a new task for artificial intelligence. The image2speech task generates
a spoken description of an image. We present baseline experiments in which the neural net …