Contextualized end-to-end speech recognition with contextual phrase prediction network

K Huang, A Zhang, Z Yang, P Guo, B Mu, T Xu… - arXiv preprint arXiv …, 2023 - arxiv.org
Contextual information plays a crucial role in speech recognition technologies and
incorporating it into the end-to-end speech recognition models has drawn immense interest …

Can contextual biasing remain effective with Whisper and GPT-2?

G Sun, X Zheng, C Zhang, PC Woodland - arXiv preprint arXiv:2306.01942, 2023 - arxiv.org
End-to-end automatic speech recognition (ASR) and large language models, such as
Whisper and GPT-2, have recently been scaled to use vast amounts of training data. Despite …

Towards contextual spelling correction for customization of end-to-end speech recognition systems

X Wang, Y Liu, J Li, V Miljanic, S Zhao… - … /ACM Transactions on …, 2022 - ieeexplore.ieee.org
Contextual biasing is an important and challenging task for end-to-end automatic speech
recognition (ASR) systems, which aims to achieve better recognition performance by biasing …

Improving contextual recognition of rare words with an alternate spelling prediction model

JD Fox, N Delworth - arXiv preprint arXiv:2209.01250, 2022 - arxiv.org
Contextual ASR, which takes a list of bias terms as input along with audio, has drawn recent
interest as ASR use becomes more widespread. We are releasing contextual biasing lists to …

Slot-triggered contextual biasing for personalized speech recognition using neural transducers

S Tong, P Harding, S Wiesler - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org
End-to-end (E2E) automatic speech recognition (ASR) models have been found to perform
well on general transcription tasks but often fail to correctly recognize words that occur …

Adaptive contextual biasing for transducer based streaming speech recognition

T Xu, Z Yang, K Huang, P Guo, A Zhang, B Li… - arXiv preprint arXiv …, 2023 - arxiv.org
By incorporating additional contextual information, deep biasing methods have emerged as
a promising solution for speech recognition of personalized words. However, for real-world …

PromptASR for contextualized ASR with controllable style

X Yang, W Kang, Z Yao, Y Yang, L Guo… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org
Prompts are crucial to large language models as they provide context information such as
topic or logical relationships. Inspired by this, we propose PromptASR, a framework that …

Minimising biasing word errors for contextual ASR with the tree-constrained pointer generator

G Sun, C Zhang, PC Woodland - IEEE/ACM Transactions on …, 2022 - ieeexplore.ieee.org
Contextual knowledge is essential for reducing speech recognition errors on high-valued
long-tail words. This paper proposes a novel tree-constrained pointer generator (TCPGen) …

Knowledge-aware audio-grounded generative slot filling for limited annotated data

G Sun, C Zhang, I Vulić, P Budzianowski… - Computer Speech & …, 2025 - Elsevier
Manually annotating fine-grained slot-value labels for task-oriented dialogue (ToD) systems
is an expensive and time-consuming endeavour. This motivates research into slot-filling …

Contextualized end-to-end automatic speech recognition with intermediate biasing loss

M Shakeel, Y Sudo, Y Peng, S Watanabe - arXiv preprint arXiv …, 2024 - arxiv.org
Contextualized end-to-end automatic speech recognition has been an active research area,
with recent efforts focusing on the implicit learning of contextual phrases based on the final …