Language modeling for code-mixing: The role of linguistic theory based synthetic data

A Pratapa, G Bhat, M Choudhury… - Proceedings of the …, 2018 - aclanthology.org
Training language models for code-mixed (CM) language is known to be a difficult problem
because of a lack of data, compounded by the increased confusability due to the presence of …

A review of the Mandarin-English code-switching corpus: SEAME

G Lee, TN Ho, ES Chng, H Li - 2017 International Conference …, 2017 - ieeexplore.ieee.org
In this paper, we report the development of the South East Asia Mandarin-English (SEAME)
corpus, including 63 hours of transcribed spontaneous Mandarin-English code-switching …

LSTM language models for LVCSR in first-pass decoding and lattice-rescoring

E Beck, W Zhou, R Schlüter, H Ney - arXiv preprint arXiv:1907.01030, 2019 - arxiv.org
LSTM based language models are an important part of modern LVCSR systems as they
significantly improve performance over traditional backoff language models. Incorporating …

Curriculum design for code-switching: Experiments with language identification and language modeling with deep neural networks

M Choudhury, K Bali, S Sitaram… - Proceedings of the 14th …, 2017 - aclanthology.org
Curriculum learning strategies are known to improve the accuracy, robustness and
convergence rate for various language learning tasks using deep architectures (Bengio et …

Semi-supervised adaptation of assistant based speech recognition models for different approach areas

M Kleinert, H Helmke, G Siol, H Ehr… - 2018 IEEE/AIAA 37th …, 2018 - ieeexplore.ieee.org
Air Navigation Service Providers (ANSPs) are replacing paper flight strips with different digital
solutions. The commands issued by air traffic controllers (ATCos) are then available …

Improvements to n-gram language model using text generated from neural language model

M Suzuki, N Itoh, T Nagano, G Kurata… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
Although neural language models have emerged, n-gram language models are still used for
many speech recognition tasks. This paper proposes four methods to improve n-gram …

Approximated and Domain-Adapted LSTM Language Models for First-Pass Decoding in Speech Recognition

M Singh, Y Oualil, D Klakow - INTERSPEECH, 2017 - isca-archive.org
Traditionally, short-range Language Models (LMs) like the conventional n-gram
models have been used for language model adaptation. Recent work has improved …

Improving N-gram language models with pre-trained deep transformer

Y Wang, H Huang, Z Liu, Y Pang, Y Wang… - arXiv preprint arXiv …, 2019 - arxiv.org
Although n-gram language models (LMs) have been outperformed by the state-of-the-art
neural LMs, they are still widely used in speech recognition due to their high efficiency in …

Iterative Learning of Speech Recognition Models for Air Traffic Control

A Srinivasamurthy, P Motlicek, M Singh, Y Oualil… - …, 2018 - publications.idiap.ch
Automatic Speech Recognition (ASR) has recently proved to be a useful tool to
reduce the workload of air traffic controllers, leading to significant gains in operational …

On the N-gram Approximation of Pre-trained Language Models

A Krishnan, J Alabi, D Klakow - arXiv preprint arXiv:2306.06892, 2023 - arxiv.org
Large pre-trained language models (PLMs) have shown remarkable performance across
various natural language understanding (NLU) tasks, particularly in low-resource settings …