Robust speech recognition via large-scale weak supervision

A Radford, JW Kim, T Xu, G Brockman… - International …, 2023 - proceedings.mlr.press
We study the capabilities of speech processing systems trained simply to predict large
amounts of transcripts of audio on the internet. When scaled to 680,000 hours of multilingual …

Language identification: A tutorial

E Ambikairajah, H Li, L Wang, B Yin… - IEEE Circuits and …, 2011 - ieeexplore.ieee.org
This tutorial presents an overview of the progression of spoken language identification (LID)
systems and current developments. The introduction provides a background on automatic …

Google's multilingual neural machine translation system: Enabling zero-shot translation

M Johnson, M Schuster, QV Le, M Krikun… - Transactions of the …, 2017 - direct.mit.edu
We propose a simple solution to use a single Neural Machine Translation (NMT) model to
translate between multiple languages. Our solution requires no changes to the model …

[BOOK][B] The conversational interface

MF McTear, Z Callejas, D Griol - 2016 - Springer
When we first started planning to write a book on how people would be able to talk in a
natural way to their smartphones, devices and robots, we could not have anticipated that the …

Automatic speech recognition for under-resourced languages: A survey

L Besacier, E Barnard, A Karpov, T Schultz - Speech communication, 2014 - Elsevier
Speech processing for under-resourced languages is an active field of research, which has
experienced significant progress during the past decade. We propose, in this paper, a …

A survey on automatic speech recognition systems for Portuguese language and its variations

TA de Lima, M Da Costa-Abreu - Computer Speech & Language, 2020 - Elsevier
Communication has been an essential part of being human and living in society. There are
several different languages and variations of them, so you can speak English in one place …

Brain-to-text: decoding spoken phrases from phone representations in the brain

C Herff, D Heger, A De Pesters, D Telaar… - Frontiers in …, 2015 - frontiersin.org
It has long been speculated whether communication between humans and machines based
on natural speech related cortical activity is possible. Over the past decade, studies have …

Bytes are all you need: End-to-end multilingual speech recognition and synthesis with bytes

B Li, Y Zhang, T Sainath, Y Wu… - ICASSP 2019-2019 IEEE …, 2019 - ieeexplore.ieee.org
We present two end-to-end models: Audio-to-Byte (A2B) and Byte-to-Audio (B2A), for
multilingual speech recognition and synthesis. Prior work has predominantly used …

Community participatory research with deaf sign language users to identify health inequities

S Barnett, JD Klein, RQ Pollard Jr… - … journal of public …, 2011 - ajph.aphapublications.org
Deaf people who use American Sign Language (ASL) are medically underserved and often
excluded from health research and surveillance. We used a community participatory …

GlobalPhone: A multilingual text & speech database in 20 languages

T Schultz, NT Vu, T Schlippe - 2013 IEEE International …, 2013 - ieeexplore.ieee.org
This paper describes the advances in the multilingual text and speech database
GlobalPhone, a multilingual database of high-quality read speech with corresponding …