Recent advances in end-to-end automatic speech recognition
J Li - APSIPA Transactions on Signal and Information …, 2022 - nowpublishers.com
Recently, the speech community is seeing a significant trend of moving from deep neural
network based hybrid modeling to end-to-end (E2E) modeling for automatic speech …
A review on big data based on deep neural network approaches
Big data analytics has become a significant trend for many businesses as a result of the
daily acquisition of enormous volumes of data. This information has been gathered because …
SeamlessM4T: Massively Multilingual & Multimodal Machine Translation
What does it take to create the Babel Fish, a tool that can help individuals translate speech
between any two languages? While recent breakthroughs in text-based models have …
The Multilingual TEDx corpus for speech recognition and translation
We present the Multilingual TEDx corpus, built to support speech recognition (ASR) and
speech translation (ST) research across many non-English source languages. The corpus is …
ESPnet-ST: All-in-one speech translation toolkit
We present ESPnet-ST, which is designed for the quick development of speech-to-speech
translation systems in a single framework. ESPnet-ST is a new project inside end-to-end …
Cascade versus direct speech translation: Do the differences still make a difference?
Five years after the first published proofs of concept, direct approaches to speech translation
(ST) are now competing with traditional cascade solutions. In light of this steady progress …
Improving speech translation by understanding and learning from the auxiliary text translation task
Pretraining and multitask learning are widely used to improve the speech to text translation
performance. In this study, we are interested in training a speech to text translation model …
Learning shared semantic space for speech-to-text translation
Having numerous potential applications and great impact, end-to-end speech translation
(ST) has long been treated as an independent task, failing to fully draw strength from the …
Revisiting end-to-end speech-to-text translation from scratch
End-to-end (E2E) speech-to-text translation (ST) often depends on pretraining its
encoder and/or decoder using source transcripts via speech recognition or text translation …
Speech translation and the end-to-end promise: Taking stock of where we are
M Sperber, M Paulik - arXiv preprint arXiv:2004.06358, 2020 - arxiv.org
Over its three decade history, speech translation has experienced several shifts in its
primary research themes; moving from loosely coupled cascades of speech recognition and …