A high-performance speech neuroprosthesis
Speech brain–computer interfaces (BCIs) have the potential to restore rapid communication
to people with paralysis by decoding neural activity evoked by attempted speech into text, or …
High-performance brain-to-text communication via handwriting
Brain–computer interfaces (BCIs) can restore communication to people who have lost the
ability to move or speak. So far, a major focus of BCI research has been on restoring gross …
Snips voice platform: an embedded spoken language understanding system for private-by-design voice interfaces
This paper presents the machine learning architecture of the Snips Voice Platform, a
software solution to perform Spoken Language Understanding on microprocessors typical of …
[BOOK] Deep learning for NLP and speech recognition
With the widespread adoption of deep learning, natural language processing (NLP), and
speech applications in various domains such as finance, healthcare, and government and …
Purely sequence-trained neural networks for ASR based on lattice-free MMI.
In this paper we describe a method to perform sequence-discriminative training of neural
network acoustic models without the need for frame-level cross-entropy pre-training. We use …
Improving transformer-based end-to-end speech recognition with connectionist temporal classification and language model integration
T Nakatani - proc. INTERSPEECH, 2019 - isca-archive.org
The state-of-the-art neural network architecture named Transformer has been used
successfully for many sequence-to-sequence transformation tasks. The advantage of this …
The elements of differentiable programming
Artificial intelligence has recently experienced remarkable advances, fueled by large
models, vast datasets, accelerated hardware, and, last but not least, the transformative …
Hybrid autoregressive transducer (HAT)
E Variani, D Rybach, C Allauzen… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
This paper proposes and evaluates the hybrid autoregressive transducer (HAT) model, a
time-synchronous encoder-decoder model that preserves the modularity of conventional …
Personalized speech recognition on mobile devices
We describe a large vocabulary speech recognition system that is accurate, has low latency,
and yet has a small enough memory and computational footprint to run faster than real-time …
A comprehensive survey of automated audio captioning
Automated audio captioning, a task that mimics human perception as well as innovatively
links audio processing and natural language processing, has overseen much progress over …