Enhancing code-switching speech recognition with interactive language biases

H Liu, LP Garcia, X Zhang, AWH Khong… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Languages usually switch within a multilingual speech signal, especially in a bilingual
society. This phenomenon is referred to as code-switching (CS), making automatic speech …

Reducing language confusion for code-switching speech recognition with token-level language diarization

H Liu, H Xu, LP Garcia, AWH Khong… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org
Code-switching (CS) occurs when languages switch within a speech signal and leads to
language confusion for automatic speech recognition (ASR). We address the problem of …

Aligning speech to languages to enhance code-switching speech recognition

H Liu, X Zhang, H Zhang, LP Garcia… - arXiv preprint arXiv …, 2024 - arxiv.org
Code-switching (CS) refers to the switching of languages within a speech signal and results
in language confusion for automatic speech recognition (ASR). To address language …

MERLIon CCS Challenge: A English-Mandarin code-switching child-directed speech corpus for language identification and diarization

VYH Chua, H Liu, LPG Perera, FT Woon… - arXiv preprint arXiv …, 2023 - arxiv.org
To enhance the reliability and robustness of language identification (LID) and language
diarization (LD) systems for heterogeneous populations and scenarios, there is a need for …

Cross-corpora spoken language identification with domain diversification and generalization

S Dey, M Sahidullah, G Saha - Computer Speech & Language, 2023 - Elsevier
This work addresses the cross-corpora generalization issue for the low-resourced spoken
language identification (LID) problem. We have conducted the experiments in the context of …

Towards trustworthy phoneme boundary detection with autoregressive model and improved evaluation metric

H Kim, HS Choi - … 2023-2023 IEEE International Conference on …, 2023 - ieeexplore.ieee.org
Phoneme boundary detection has been studied due to its central role in various speech
applications. In this work, we point out that this task needs to be addressed not only by …

Comparison of different neural network architectures for spoken language identification

T Bazazo, M Zeineldeen, C Plahl… - … 15th ITG conference, 2023 - ieeexplore.ieee.org
This paper compares different neural network based architectures on the spoken language
identification task. To our best knowledge such a comparison of different models on the …

A Compact End-to-End Model with Local and Global Context for Spoken Language Identification

F Jia, NR Koluguri, J Balam, B Ginsburg - arXiv preprint arXiv:2210.15781, 2022 - arxiv.org
We introduce TitaNet-LID, a compact end-to-end neural network for Spoken Language
Identification (LID) that is based on the ContextNet architecture. TitaNet-LID employs 1D …

Self-supervised learning representation based accent recognition with persistent accent memory

R Li, Z Xie, H Xu, Y Peng, H Liu, H Huang… - Proceedings of the …, 2023 - isca-archive.org
Accent recognition (AR) is challenging due to the lack of training data as well as the accents
are entangled with speakers and regional characteristics. This paper aims to improve AR …

Investigating model performance in language identification: beyond simple error statistics

SJ Styles, VYH Chua, FT Woon, H Liu… - arXiv preprint arXiv …, 2023 - arxiv.org
Language development experts need tools that can automatically identify languages from
fluent, conversational speech, and provide reliable estimates of usage rates at the level of …