Improving massively multilingual ASR with auxiliary CTC objectives

W Chen, B Yan, J Shi, Y Peng, S Maiti… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Multilingual Automatic Speech Recognition (ASR) models have extended the usability of
speech technologies to a wide variety of languages. With how many languages these …

Streaming end-to-end multilingual speech recognition with joint language identification

C Zhang, B Li, T Sainath, T Strohman… - arXiv preprint arXiv …, 2022 - arxiv.org
Language identification is critical for many downstream tasks in automatic speech
recognition (ASR), and is beneficial to integrate into multilingual end-to-end ASR as an …

Aligning speech to languages to enhance code-switching speech recognition

H Liu, X Zhang, H Zhang, LP Garcia… - arXiv preprint arXiv …, 2024 - arxiv.org
Code-switching (CS) refers to the switching of languages within a speech signal and results
in language confusion for automatic speech recognition (ASR). To address language …

Language-specific characteristic assistance for code-switching speech recognition

T Song, Q Xu, M Ge, L Wang, H Shi, Y Lv, Y Lin… - arXiv preprint arXiv …, 2022 - arxiv.org
Dual-encoder structure successfully utilizes two language-specific encoders (LSEs) for
code-switching speech recognition. Because LSEs are initialized by two pre-trained language …

LAE: Language-aware encoder for monolingual and multilingual ASR

J Tian, J Yu, C Zhang, C Weng, Y Zou, D Yu - arXiv preprint arXiv …, 2022 - arxiv.org
Despite the rapid progress in automatic speech recognition (ASR) research, recognizing
multilingual speech using a unified ASR system remains highly challenging. Previous works …

Language-routing mixture of experts for multilingual and code-switching speech recognition

W Wang, G Ma, Y Li, B Du - arXiv preprint arXiv:2307.05956, 2023 - arxiv.org
Multilingual speech recognition for both monolingual and code-switching speech is a
challenging task. Recently, based on the Mixture of Experts (MoE), many works have made …

Enhancing code-switching speech recognition with interactive language biases

H Liu, LP Garcia, X Zhang, AWH Khong… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Languages usually switch within a multilingual speech signal, especially in a bilingual
society. This phenomenon is referred to as code-switching (CS), making automatic speech …

Towards zero-shot code-switched speech recognition

B Yan, M Wiesner, O Klejch, P Jyothi… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
In this work, we seek to build effective code-switched (CS) automatic speech recognition
systems (ASR) under the zero-shot setting where no transcribed CS speech data is …

Adapting OpenAI's Whisper for speech recognition on code-switch Mandarin-English SEAME and ASRU2019 datasets

Y Yang, Y Peng, H Huang, ES Chng… - 2024 Asia Pacific …, 2024 - ieeexplore.ieee.org
This paper reports on SOTA results achieved using OpenAI's Whisper model with adaptation
on different adaptation corpus sizes for two established code-switch Mandarin/English …

Internal language model estimation based language model fusion for cross-domain code-switching speech recognition

Y Peng, Y Liu, J Zhang, H Xu, Y He, H Huang… - arXiv preprint arXiv …, 2022 - arxiv.org
Internal Language Model Estimation (ILME) based language model (LM) fusion has been
shown to significantly improve recognition results over conventional shallow fusion in both …