Robust wake word spotting with frame-level cross-modal attention based audio-visual conformer

H Wang, M Cheng, Q Fu, M Li - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
In recent years, neural network-based Wake Word Spotting achieves good performance on
clean audio samples but struggles in noisy environments. Audio-Visual Wake Word Spotting …

A Theoretical Framework for Acoustic Neighbor Embeddings

W Jeon - arXiv preprint arXiv:2412.02164, 2024 - arxiv.org
This paper provides a theoretical framework for interpreting acoustic neighbor embeddings,
which are representations of the phonetic content of variable-width audio or text in a fixed …

The WHU Wake Word Lipreading System for the 2024 Chat-Scenario Chinese Lipreading Challenge

H Wang, C Li, F Su, J Liu, H Suo… - 2024 IEEE International …, 2024 - ieeexplore.ieee.org
The paper describes the Wake Word Lipreading system developed by the WHU team for the
ChatCLR Challenge 2024. Although Lipreading and Wake Word Spotting have seen …

Leveraging synthetic speech for CIF-based customized keyword spotting

S Liu, A Zhang, K Huang, L Xie - National Conference on Man-Machine …, 2023 - Springer
Customized keyword spotting aims to detect user-defined keywords from continuous
speech, providing flexibility and personalization. Previous research mainly relied on …