Google Scholar

SALM: Speech-augmented language model with in-context learning for speech recognition and translation

Z Chen, H Huang, A Andrusenko… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

We present a novel Speech Augmented Language Model (SALM) with multitask and in-context
learning capabilities. SALM comprises a frozen text LLM, an audio encoder, a …

Cited by 37 · All 5 versions

[PDF] arxiv.org

Contextualized end-to-end automatic speech recognition with intermediate biasing loss

M Shakeel, Y Sudo, Y Peng, S Watanabe - arXiv preprint arXiv …, 2024 - arxiv.org

Contextualized end-to-end automatic speech recognition has been an active research area,
with recent efforts focusing on the implicit learning of contextual phrases based on the final …

Cited by 3 · All 5 versions

[PDF] arxiv.org

Phoneme-aware encoding for prefix-tree-based contextual ASR

H Futami, E Tsunoo, Y Kashiwagi… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

In speech recognition applications, it is important to recognize context-specific rare words,
such as proper nouns. Tree-constrained Pointer Generator (TCPGen) has shown promise …

Cited by 5 · All 5 versions

[PDF] arxiv.org

Adapting OpenAI's Whisper for speech recognition on code-switch Mandarin-English SEAME and ASRU2019 datasets

Y Yang, Y Peng, H Huang, ES Chng… - 2024 Asia Pacific …, 2024 - ieeexplore.ieee.org

This paper reports on SOTA results achieved using OpenAI's Whisper model with adaptation
on different adaptation corpus sizes for two established code-switch Mandarin/English …

Cited by 5 · All 4 versions

[PDF] arxiv.org

Keyword-guided adaptation of automatic speech recognition

A Shamsian, A Navon, N Glazer, G Hetz… - arXiv preprint arXiv …, 2024 - arxiv.org

Automatic Speech Recognition (ASR) technology has made significant progress in recent
years, providing accurate transcription across various domains. However, some challenges …

Cited by 2 · All 7 versions

[PDF] arxiv.org

Improving Whisper's Recognition Performance for Under-Represented Language Kazakh Leveraging Unpaired Speech and Text

J Li, Y Pu, Q Sun, WQ Zhang - arXiv preprint arXiv:2408.05554, 2024 - arxiv.org

Whisper and other large-scale automatic speech recognition models have made significant
progress in performance. However, their performance on many low-resource languages …

Cited by 2 · All 4 versions

[PDF] arxiv.org

Mai Ho'omāuna i ka 'Ai: Language Models Improve Automatic Speech Recognition in Hawaiian

K Chaparala, G Zarrella, BT Fischer, L Kimura… - arXiv preprint arXiv …, 2024 - arxiv.org

In this paper we address the challenge of improving Automatic Speech Recognition (ASR)
for a low-resource language, Hawaiian, by incorporating large amounts of independent text …

Cited by 1 · All 7 versions

[PDF] aclanthology.org

Speech-enriched memory for inference-time adaptation of ASR models to word dictionaries

A Mittal, S Sarawagi, P Jyothi, G Saon… - Proceedings of the …, 2023 - aclanthology.org

Despite the impressive performance of ASR models on mainstream benchmarks, their
performance on rare words is unsatisfactory. In enterprise settings, often a focused list of …

Cited by 2 · All 4 versions

[PDF] isca-archive.org

[PDF] Contextual Biasing Speech Recognition in Speech-enhanced Large Language Model

X Gong, A Lv, Z Wang, Y Qian - Proc. Interspeech. ISCA, 2024 - isca-archive.org

Recently, the rapid advancements in audio- and speech-enhanced large language models
(SpeechLLMs), such as Qwen-Audio and SALMONN, have significantly propelled automatic …

Cited by 2 · All 2 versions

[PDF] arxiv.org

Enhancing quantised end-to-end ASR models via personalisation

Q Zhao, G Sun, C Zhang, M Xu… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

Recent end-to-end automatic speech recognition (ASR) models have become increasingly
larger, making them particularly challenging to be deployed on resource-constrained …

Cited by 2 · All 3 versions

Can contextual biasing remain effective with Whisper and GPT-2?
