ESPnet-SLU: Advancing spoken language understanding through ESPnet
As Automatic Speech Recognition (ASR) systems are getting better, there is an increasing
interest in using the ASR output to do downstream Natural Language Processing (NLP) …
CWCL: Cross-modal transfer with continuously weighted contrastive loss
This paper considers contrastive training for cross-modal 0-shot transfer wherein a pre-
trained model in one modality is used for representation learning in another domain using …
A study on the integration of pre-trained SSL, ASR, LM and SLU models for spoken language understanding
Collecting sufficient labeled data for spoken language understanding (SLU) is expensive
and time-consuming. Recent studies achieved promising results by using pre-trained …
Integration of pre-trained networks with continuous token interface for end-to-end spoken language understanding
Most End-to-End (E2E) Spoken Language Understanding (SLU) networks leverage the pre-
trained Automatic Speech Recognition (ASR) networks but still lack the capability to …
UniverSLU: Universal spoken language understanding for diverse classification and sequence generation tasks with a single network
Recent studies have demonstrated promising outcomes by employing large language
models with multi-tasking capabilities. They utilize prompts to guide the model's behavior …
Two-pass low latency end-to-end spoken language understanding
End-to-end (E2E) models are becoming increasingly popular for spoken language
understanding (SLU) systems and are beginning to achieve competitive performance to …
Integrating pretrained ASR and LM to perform sequence generation for spoken language understanding
There has been an increased interest in the integration of pretrained speech recognition
(ASR) and language models (LM) into the SLU framework. However, prior methods often …
On the use of semantically-aligned speech representations for spoken language understanding
In this paper we examine the use of semantically-aligned speech representations for end-to-
end spoken language understanding (SLU). We employ the recently-introduced SAMU …
End-to-end spoken language understanding with tree-constrained pointer generator
End-to-end spoken language understanding (SLU) suffers from the long-tail word problem.
This paper exploits contextual biasing, a technique to improve the speech recognition of rare …
Improving Spoken Language Understanding with Cross-Modal Contrastive Learning
Spoken language understanding (SLU) is conventionally based on pipeline architecture
with error propagation issues. To mitigate this problem, end-to-end (E2E) models are …