Deep spoken keyword spotting: An overview

I López-Espejo, ZH Tan, JHL Hansen, J Jensen - IEEE Access, 2021 - ieeexplore.ieee.org
Spoken keyword spotting (KWS) deals with the identification of keywords in audio streams
and has become a fast-growing technology thanks to the paradigm shift introduced by deep …

Language model modification for local speech recognition systems using remote sources

M Deisher, G Stemmer - US Patent 10,325,590, 2019 - Google Patents
A language model is modified for a local speech recognition system using remote speech
recognition sources. In one example, a speech utterance is received. The speech utterance …

Query-by-example keyword spotting using long short-term memory networks

G Chen, C Parada, TN Sainath - 2015 IEEE international …, 2015 - ieeexplore.ieee.org
We present a novel approach to query-by-example keyword spotting (KWS) using a long
short-term memory (LSTM) recurrent neural network-based feature extractor. In our …

Compressed time delay neural network for small-footprint keyword spotting

M Sun, D Snyder, Y Gao, V Nagaraja, M Rodehorst… - 2017 - amazon.science
In this paper we investigate a time delay neural network (TDNN) for a keyword spotting task
that requires low CPU, memory and latency. The TDNN is trained with transfer learning and …

Max-pooling loss training of long short-term memory networks for small-footprint keyword spotting

M Sun, A Raju, G Tucker… - 2016 IEEE spoken …, 2016 - ieeexplore.ieee.org
We propose a max-pooling based loss function for training Long Short-Term Memory
(LSTM) networks for small-footprint keyword spotting (KWS), with low CPU, memory, and …

Spoken content retrieval—beyond cascading speech recognition with text retrieval

L Lee, J Glass, H Lee, C Chan - IEEE/ACM Transactions on …, 2015 - ieeexplore.ieee.org
Spoken content retrieval refers to directly indexing and retrieving spoken content based on
the audio rather than text descriptions. This potentially eliminates the requirement of …

Monophone-based background modeling for two-stage on-device wake word detection

M Wu, S Panchapagesan, M Sun, J Gu… - … , Speech and Signal …, 2018 - ieeexplore.ieee.org
Accurate on-device wake word detection is crucial to products with far-field voice control
such as the Amazon Echo. It is quite challenging to build a wake word system with both low …

Neural-network lexical translation for cross-lingual IR from text and speech

R Zbib, L Zhao, D Karakos, W Hartmann… - Proceedings of the …, 2019 - dl.acm.org
We propose a neural network model to estimate word translation probabilities for Cross-
Lingual Information Retrieval (CLIR). The model estimates better probabilities for word …

The Kaldi OpenKWS System: Improving Low Resource Keyword Search

J Trmal, M Wiesner, V Peddinti, X Zhang… - Interspeech, 2017 - researchgate.net
The IARPA BABEL program has stimulated worldwide research in keyword search
technology for low resource languages, and the NIST OpenKWS evaluations are the de …

Improving Neural Biasing for Contextual Speech Recognition by Early Context Injection and Text Perturbation

R Huang, M Yarmohammadi, S Khudanpur… - arXiv preprint arXiv …, 2024 - arxiv.org
Existing research suggests that automatic speech recognition (ASR) models can benefit
from additional contexts (e.g., contact lists, user-specified vocabulary). Rare words and …