Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
Graph neural networks for contextual ASR with the tree-constrained pointer generator
Incorporating biasing words obtained through contextual knowledge is paramount in
automatic speech recognition (ASR) applications. This paper proposes an innovative …
Selective biasing with trie-based contextual adapters for personalised speech recognition using neural transducers
Neural transducer ASR models achieve state of the art accuracy on many tasks, however
rare word recognition poses a particular challenge as models often fail to recognise words …
Automatic Speech Recognition Design Modeling
K Babu Rao, B Mopuru, M Jawarneh… - Conversational …, 2024 - Wiley Online Library
The term “automatic speech recognition” refers to the procedure by which an auditory signal
of spoken words can be converted into text. Voice recognition is another term that may be …
Incorporating Class-based Language Model for Named Entity Recognition in Factorized Neural Transducer
In spite of the excellent strides made by end-to-end (E2E) models in speech recognition in
recent years, named entity recognition is still challenging but critical for semantic …
Integrating Language Models into Direct Speech Translation: An Inference-Time Solution to Control Gender Inflection
When translating words referring to the speaker, speech translation (ST) systems should not
resort to default masculine generics nor rely on potentially misleading vocal traits. Rather …
Entropy-Based Dynamic Rescoring with Language Model in E2E ASR Systems
Z Gong, D Saito, N Minematsu - Applied Sciences, 2022 - mdpi.com
Language models (LM) have played crucial roles in automatic speech recognition (ASR),
whether as an essential part of a conventional ASR system composed of an acoustic model …
DOC-RAG: ASR Language Model Personalization with Domain-Distributed Co-occurrence Retrieval Augmentation
We propose DOC-RAG, Domain-distributed Co-occurrence Retrieval Augmentation
for ASR language model personalization aiming to improve the automatic speech …