Improvements in RWTH LVCSR evaluation systems for Polish, Portuguese, English, Urdu, and Arabic.
Abstract: In this work, Portuguese, Polish, English, Urdu, and Arabic automatic speech
recognition evaluation systems developed by the RWTH Aachen University are presented …
Feature-rich sub-lexical language models using a maximum entropy approach for German LVCSR.
German is a morphologically rich language having a high degree of word inflections,
derivations and compounding. This leads to high out-of-vocabulary (OOV) rates and poor …
Factored language modeling for Russian LVCSR
The Russian language is characterized by very flexible word order, which limits the ability of
the standard n-grams to capture important regularities in the data. Moreover, Russian is …
Experimenting with factored language model and generalized back-off for Hindi
Abstract: Language modeling is a statistical technique to represent the text data in machine
readable format. It finds the probability distribution of sequence of words present in the text …
Probabilistic modelling of morphologically rich languages
JA Botha - arXiv preprint arXiv:1508.04271, 2015 - arxiv.org
This thesis investigates how the sub-structure of words can be accounted for in probabilistic
models of language. Such models play an important role in natural language processing …
Experiments towards a better LVCSR System for Tamil
This paper summarizes our latest efforts in the development of a Large Vocabulary
Continuous Speech Recognition (LVCSR) system for Tamil at different levels: pronunciation …
Morpheme level hierarchical Pitman-Yor class-based language models for LVCSR of morphologically rich languages.
Performing large vocabulary continuous speech recognition (LVCSR) for morphologically
rich languages is considered a challenging task. The morphological richness of such …
Morpheme Level Feature-based Language Models for German LVCSR.
One of the challenges for Large Vocabulary Continuous Speech Recognition (LVCSR) of
German is its complex morphology and high level of compounding. It leads to high Out-of …
Investigation on language modelling approaches for open vocabulary speech recognition
B Shaik, M Ali - 2016 - publications.rwth-aachen.de
By definition, words that are not present in a recognition vocabulary are called out-of-
vocabulary (OOV) words. Recognition of unseen or new words is an important feature that is …
Speech Recognition in Inflective Languages
G Donaj, Z Kačič - Language Modeling for Automatic …, 2017 - Springer
In this chapter basic concepts of speech recognition are presented. Acoustic processing,
acoustic modeling and search algorithms are briefly described. A more detailed explanation …