Improvements in RWTH LVCSR evaluation systems for Polish, Portuguese, English, Urdu, and Arabic.

MAB Shaik, Z Tüske, MA Tahir, M Nußbaum-Thom… - …, 2015 - isca-archive.org
In this work, Portuguese, Polish, English, Urdu, and Arabic automatic speech
recognition evaluation systems developed by the RWTH Aachen University are presented …

Feature-rich sub-lexical language models using a maximum entropy approach for German LVCSR.

MAB Shaik, AED Mousa, R Schlüter, H Ney - Interspeech, 2013 - academia.edu
German is a morphologically rich language having a high degree of word inflections,
derivations and compounding. This leads to high out-of-vocabulary (OOV) rates and poor …

Factored language modeling for Russian LVCSR

D Vazhenina, K Markov - … & Ubi-Media Computing (iCAST 2013 …, 2013 - ieeexplore.ieee.org
The Russian language is characterized by very flexible word order, which limits the ability of
the standard n-grams to capture important regularities in the data. Moreover, Russian is …

Experimenting with factored language model and generalized back-off for Hindi

AR Babhulgaonkar, SP Sonavane - International Journal of Information …, 2022 - Springer
Language modeling is a statistical technique to represent the text data in machine
readable format. It finds the probability distribution of sequence of words present in the text …

Probabilistic modelling of morphologically rich languages

JA Botha - arXiv preprint arXiv:1508.04271, 2015 - arxiv.org
This thesis investigates how the sub-structure of words can be accounted for in probabilistic
models of language. Such models play an important role in natural language processing …

Experiments towards a better LVCSR System for Tamil

MJJ Premkumar, NT Vu, T Schultz - Training, 2013 - isca-archive.org
This paper summarizes our latest efforts in the development of a Large Vocabulary
Continuous Speech Recognition (LVCSR) system for Tamil at different levels: pronunciation …

Morpheme level hierarchical Pitman-Yor class-based language models for LVCSR of morphologically rich languages.

AED Mousa, MAB Shaik… - …, 2013 - www-i6.informatik.rwth-aachen.de
Performing large vocabulary continuous speech recognition (LVCSR) for morphologically
rich languages is considered a challenging task. The morphological richness of such …

Morpheme Level Feature-based Language Models for German LVCSR.

AED Mousa, MAB Shaik, R Schlüter, H Ney - INTERSPEECH, 2012 - isca-archive.org
One of the challenges for Large Vocabulary Continuous Speech Recognition (LVCSR) of
German is its complex morphology and high level of compounding. It leads to high Out-of …

Investigation on language modelling approaches for open vocabulary speech recognition

B Shaik, M Ali - 2016 - publications.rwth-aachen.de
By definition, words that are not present in a recognition vocabulary are called out-of-
vocabulary (OOV) words. Recognition of unseen or new words is an important feature that is …

Speech Recognition in Inflective Languages

G Donaj, Z Kačič - Language Modeling for Automatic …, 2017 - Springer
In this chapter basic concepts of speech recognition are presented. Acoustic processing,
acoustic modeling and search algorithms are briefly described. A more detailed explanation …