Comparative Study of Tokenization Algorithms for End-to-End Open Vocabulary Keyword Detection

K Gurugubelli, S Mohamed… - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
The advent of Deep-Learning techniques and the increasing importance of personalization
in voice assistants fueled the need for open vocabulary keyword detection systems, in …

U2-KWS: Unified two-pass open-vocabulary keyword spotting with keyword bias

A Zhang, P Zhou, K Huang, Y Zou… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org
Open-vocabulary keyword spotting (KWS), which allows users to customize keywords, has
attracted increasingly more interest. However, existing methods based on acoustic models …

Achieving timestamp prediction while recognizing with non-autoregressive end-to-end ASR model

X Shi, Y Chen, S Zhang, Z Yan - National Conference on Man-Machine …, 2022 - Springer
Conventional ASR systems use frame-level phoneme posterior to conduct force-alignment
(FA) and provide timestamps, while end-to-end ASR systems especially AED based ones …

TDT-KWS: Fast and accurate keyword spotting using token-and-duration transducer

Y Xi, H Li, B Yang, H Li, H Xu… - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
Designing an efficient keyword spotting (KWS) system that delivers exceptional performance
on resource-constrained edge devices has long been a subject of significant attention …

Global-Local Convolution with Spiking Neural Networks for Energy-efficient Keyword Spotting

S Wang, D Zhang, K Shi, Y Wang, W Wei, J Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Thanks to Deep Neural Networks (DNNs), the accuracy of Keyword Spotting (KWS) has
made substantial progress. However, as KWS systems are usually implemented on edge …

Transformer-based encoder-encoder architecture for spoken term detection

J Švec, L Šmídl, J Lehečka - Asian Conference on Pattern Recognition, 2023 - Springer
The paper presents a method for spoken term detection based on the Transformer
architecture. We propose the encoder-encoder architecture employing two BERT-like …

Custom Mandarin Keyword Spotting with Extended Long Short-Term Memory

H Cao, X Liu, Z Tan, Z Yang, X Qin - IAENG International Journal of …, 2024 - iaeng.org
In real-world scenarios, Deep Neural Network (DNN)-powered Keyword Spotting (KWS)
systems are typically engineered as lightweight architectures, optimizing for superior …

Few-Shot Open-Set Keyword Spotting with Multi-Stage Training

LY Li, TH Lo, JW Hung, SC Huang… - 2024 Asia Pacific Signal …, 2024 - ieeexplore.ieee.org
As human-computer interaction technologies have continued to advance, keyword spotting
(KWS) systems have gained prominence in everyday devices. This study is dedicated to …

End-to-End Streaming Customizable Keyword Spotting Based on Text-Adaptive Neural Search

B Yang, J Guo, H Li, Y Xi, Q Zhuo, K Yu - National Conference on Man …, 2023 - Springer
Streaming keyword spotting (KWS) is an important technique for voice assistant wake-up.
While KWS with a preset fixed keyword has been well studied, test-time customizable …

Xian Shi, Yanni Chen, Shiliang Zhang, and Zhijie Yan. Speech Lab, Alibaba Group, Hangzhou, China. {shixian.shi, cyn244124, sly.zsl, zhijie.yzj}@alibaba-inc.com

X Shi, Y Chen, S Zhang, Z Yan - Man-Machine Speech …, 2023 - books.google.com
Conventional ASR systems use frame-level phoneme posterior to conduct force-alignment
(FA) and provide timestamps, while end-to-end ASR systems especially AED based ones are …