Speech-language pre-training for end-to-end spoken language understanding

Y Qian, X Bian, Y Shi, N Kanda, L Shen… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
End-to-end (E2E) spoken language understanding (SLU) can infer semantics directly from
speech signal without cascading an automatic speech recognizer (ASR) with a natural …

Integration of pre-trained networks with continuous token interface for end-to-end spoken language understanding

S Seo, D Kwak, B Lee - ICASSP 2022-2022 IEEE International …, 2022 - ieeexplore.ieee.org
Most End-to-End (E2E) Spoken Language Understanding (SLU) networks leverage the
pre-trained Automatic Speech Recognition (ASR) networks but still lack the capability to …

Towards reducing the need for speech training data to build spoken language understanding systems

S Thomas, HKJ Kuo, B Kingsbury… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
The lack of speech data annotated with labels required for spoken language understanding
(SLU) is often a major hurdle in building end-to-end (E2E) systems that can directly process …

Zero-shot end-to-end spoken language understanding via cross-modal selective self-training

J He, J Salazar, K Yao, H Li, J Cai - arXiv preprint arXiv:2305.12793, 2023 - arxiv.org
End-to-end (E2E) spoken language understanding (SLU) is constrained by the cost of
collecting speech-semantics pairs, especially when label domains change. Hence, we …

On the use of semantically-aligned speech representations for spoken language understanding

G Laperrière, V Pelloin, M Rouvier… - 2022 IEEE Spoken …, 2023 - ieeexplore.ieee.org
In this paper we examine the use of semantically-aligned speech representations for
end-to-end spoken language understanding (SLU). We employ the recently-introduced SAMU …

End-to-end model for named entity recognition from speech without paired training data

S Mdhaffar, J Duret, T Parcollet, Y Estève - arXiv preprint arXiv …, 2022 - arxiv.org
Recent works showed that end-to-end neural approaches tend to become very popular for
spoken language understanding (SLU). Through the term end-to-end, one considers the use …

Improving end-to-end speech-to-intent classification with reptile

Y Tian, PJ Gorinski - arXiv preprint arXiv:2008.01994, 2020 - arxiv.org
End-to-end spoken language understanding (SLU) systems have many advantages over
conventional pipeline systems, but collecting in-domain speech data to train an end-to-end …

Exploring transfer learning for end-to-end spoken language understanding

S Rongali, B Liu, L Cai, K Arkoudas, C Su… - Proceedings of the AAAI …, 2021 - ojs.aaai.org
Voice Assistants such as Alexa, Siri, and Google Assistant typically use a two-stage
Spoken Language Understanding pipeline; first, an Automatic Speech Recognition (ASR) …

Improving end-to-end speech processing by efficient text data utilization with latent synthesis

J Lu, W Huang, N Zheng, X Zeng, YT Yeung… - arXiv preprint arXiv …, 2023 - arxiv.org
Training a high performance end-to-end speech (E2E) processing model requires an
enormous amount of labeled speech data, especially in the era of data-centric artificial …

Leveraging acoustic and linguistic embeddings from pretrained speech and language models for intent classification

B Sharma, M Madhavi, H Li - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
Intent classification is a task in spoken language understanding. An intent classification
system is usually implemented as a pipeline process, with a speech recognition module …