Language modeling for code-mixing: The role of linguistic theory based synthetic data

A Pratapa, G Bhat, M Choudhury… - Proceedings of the …, 2018 - aclanthology.org
Training language models for Code-mixed (CM) language is known to be a difficult problem
because of lack of data compounded by the increased confusability due to the presence of …

Modeling code-switch languages using bilingual parallel corpus

G Lee, H Li - Proceedings of the 58th Annual Meeting of the …, 2020 - aclanthology.org
Language modeling is the technique to estimate the probability of a sequence of
words. A bilingual language model is expected to model the sequential dependency for …

Code-switched language models using dual RNNs and same-source pretraining

S Garg, T Parekh, P Jyothi - arXiv preprint arXiv:1809.01962, 2018 - arxiv.org
This work focuses on building language models (LMs) for code-switched text. We propose
two techniques that significantly improve these LMs: 1) A novel recurrent neural network unit …

Curriculum design for code-switching: Experiments with language identification and language modeling with deep neural networks

M Choudhury, K Bali, S Sitaram… - Proceedings of the 14th …, 2017 - aclanthology.org
Curriculum learning strategies are known to improve the accuracy, robustness and
convergence rate for various language learning tasks using deep architectures (Bengio et …

Dual language models for code switched speech recognition

S Garg, T Parekh, P Jyothi - arXiv preprint arXiv:1711.01048, 2017 - arxiv.org
In this work, we present a simple and elegant approach to language modeling for bilingual
code-switched text. Since code-switching is a blend of two or more different languages, a …

Improving N-gram language modeling for code-switching speech recognition

Z Zeng, H Xu, TY Chong, ES Chng… - 2017 Asia-Pacific Signal …, 2017 - ieeexplore.ieee.org
Code-switching language modeling is challenging because the statistics of each individual
language, as well as the cross-lingual statistics, are insufficient. To compensate for …

An improved framework for recognizing highly imbalanced bilingual code-switched lectures with cross-language acoustic modeling and frame-level language …

CF Yeh, LS Lee - IEEE/ACM Transactions on Audio, Speech …, 2015 - ieeexplore.ieee.org
This paper considers the recognition of a widely observed type of bilingual code-switched
speech: the speaker speaks primarily the host language (usually his native language), but …