Error reduction in speech processing

JP Lilly, RP Thomas, JP Adams - US Patent 9,697,827, 2017 - Google Patents
BACKGROUND Modern speech recognition systems typically include both speech layer and
understanding layer processing to analyze spoken commands or queries provided by a …

Sequence-to-sequence data augmentation for dialogue language understanding

Y Hou, Y Liu, W Che, T Liu - arXiv preprint arXiv:1807.01554, 2018 - arxiv.org
In this paper, we study the problem of data augmentation for language understanding in
task-oriented dialogue system. In contrast to previous work which augments an utterance without …

Hallucinations in neural automatic speech recognition: Identifying errors and hallucinatory models

R Frieske, BE Shi - arXiv preprint arXiv:2401.01572, 2024 - arxiv.org
Hallucinations are a type of output error produced by deep neural networks. While this has
been studied in natural language processing, they have not been researched previously in …

Hallucination of speech recognition errors with sequence to sequence learning

P Serai, V Sunder… - IEEE/ACM Transactions on …, 2022 - ieeexplore.ieee.org
Prior work in this domain has focused on modeling errors at the phonetic level, while using a
lexicon to convert the phones to words, usually accompanied by an FST Language model …

Natural language translation techniques

W Tunstall-Pedoe, RP Stacey, T Ashton… - US Patent …, 2016 - Google Patents
BACKGROUND The manner in which humans interact with computing devices is rapidly
evolving and has reached the point where human users can access services and resources …

Confusion2vec: Towards enriching vector space word representations with representational ambiguities

PG Shivakumar, P Georgiou - PeerJ Computer Science, 2019 - peerj.com
Word vector representations are a crucial part of natural language processing (NLP) and
human computer interaction. In this paper, we propose a novel word vector representation …

Learning from past mistakes: improving automatic speech recognition output via noisy-clean phrase context modeling

PG Shivakumar, H Li, K Knight… - APSIPA Transactions on …, 2019 - cambridge.org
Automatic speech recognition (ASR) systems often make unrecoverable errors due to
subsystem pruning (acoustic, language and pronunciation models); for example, pruning …

Augmenting translation models with simulated acoustic confusions for improved spoken language translation

Y Tsvetkov, F Metze, C Dyer - … of the 14th Conference of the …, 2014 - aclanthology.org
We propose a novel technique for adapting text-based statistical machine translation to deal
with input from automatic speech recognition in spoken language translation tasks. We …

Improving ASR output for endangered language documentation

R Jimerson, K Simha, R Ptucha… - The 6th intl. workshop …, 2018 - par.nsf.gov
Documenting endangered languages supports the historical preservation of diverse
cultures. Automatic speech recognition (ASR), while potentially very useful for this task, has …

Adapting machine translation models toward misrecognized speech with text-to-speech pronunciation rules and acoustic confusability

N Ruiz, G Qin, L Will, M Federico - Proceedings of Interspeech 2015, 2015 - cris.fbk.eu
In the spoken language translation pipeline, machine translation systems that are trained
solely on written bitexts are often unable to recover from speech recognition errors due to …