A high-performance speech neuroprosthesis

FR Willett, EM Kunz, C Fan, DT Avansino, GH Wilson… - Nature, 2023 - nature.com
Speech brain–computer interfaces (BCIs) have the potential to restore rapid communication
to people with paralysis by decoding neural activity evoked by attempted speech into text, or …

High-performance brain-to-text communication via handwriting

FR Willett, DT Avansino, LR Hochberg, JM Henderson… - Nature, 2021 - nature.com
Brain–computer interfaces (BCIs) can restore communication to people who have lost the
ability to move or speak. So far, a major focus of BCI research has been on restoring gross …

Snips voice platform: an embedded spoken language understanding system for private-by-design voice interfaces

A Coucke, A Saade, A Ball, T Bluche, A Caulier… - arXiv preprint arXiv …, 2018 - arxiv.org
This paper presents the machine learning architecture of the Snips Voice Platform, a
software solution to perform Spoken Language Understanding on microprocessors typical of …

[BOOK] Deep learning for NLP and speech recognition

U Kamath, J Liu, J Whitaker - 2019 - Springer
With the widespread adoption of deep learning, natural language processing (NLP), and
speech applications in various domains such as finance, healthcare, and government and …

[PDF] Purely sequence-trained neural networks for ASR based on lattice-free MMI.

D Povey, V Peddinti, D Galvez, P Ghahremani… - Interspeech, 2016 - isca-archive.org
In this paper we describe a method to perform sequence-discriminative training of neural
network acoustic models without the need for frame-level cross-entropy pre-training. We use …

[PDF] Improving transformer-based end-to-end speech recognition with connectionist temporal classification and language model integration

T Nakatani - Proc. INTERSPEECH, 2019 - isca-archive.org
The state-of-the-art neural network architecture named Transformer has been used
successfully for many sequence-to-sequence transformation tasks. The advantage of this …

The elements of differentiable programming

M Blondel, V Roulet - arXiv preprint arXiv:2403.14606, 2024 - arxiv.org
Artificial intelligence has recently experienced remarkable advances, fueled by large
models, vast datasets, accelerated hardware, and, last but not least, the transformative …

Hybrid autoregressive transducer (HAT)

E Variani, D Rybach, C Allauzen… - ICASSP 2020-2020 …, 2020 - ieeexplore.ieee.org
This paper proposes and evaluates the hybrid autoregressive transducer (HAT) model, a
time-synchronous encoder-decoder model that preserves the modularity of conventional …

Personalized speech recognition on mobile devices

I McGraw, R Prabhavalkar, R Alvarez… - … , Speech and Signal …, 2016 - ieeexplore.ieee.org
We describe a large vocabulary speech recognition system that is accurate, has low latency,
and yet has a small enough memory and computational footprint to run faster than real-time …

A comprehensive survey of automated audio captioning

X Xu, M Wu, K Yu - arXiv preprint arXiv:2205.05357, 2022 - arxiv.org
Automated audio captioning, a task that mimics human perception as well as innovatively
links audio processing and natural language processing, has overseen much progress over …