MLCA-AVSR: Multi-layer cross attention fusion based audio-visual speech recognition

H Wang, P Guo, P Zhou, L Xie - ICASSP 2024-2024 IEEE …, 2024 - ieeexplore.ieee.org
While automatic speech recognition (ASR) systems degrade significantly in noisy
environments, audio-visual speech recognition (AVSR) systems aim to complement the …

Conversational Short-phrase Speaker Diarization via Self-adjusting Speech Segmentation and Embedding Extraction

H Lu, G Cheng, Y Yan - IEEE Signal Processing Letters, 2024 - ieeexplore.ieee.org
Conversational short-phrase speaker diarization focuses on diarizing the phrases that are
short in duration. Nonetheless, conventional speaker diarization systems fail to give enough …

A new speaker-diarization technology with denoising spectral-LSTM for online automatic multi-dialogue recording

DY Chan, JF Wang, HT Chin - Multimedia Tools and Applications, 2024 - Springer
In AI pandemic applications, the online automatic AI recording apparatus for official councils
such as court trials, business conferences and commercial meetings will become imperative …