A review on speaker diarization systems and approaches

MH Moattar, MM Homayounpour - Speech Communication, 2012 - Elsevier
Speaker indexing or diarization is an important task in audio processing and retrieval.
Speaker diarization is the process of labeling a speech signal with labels corresponding to …

rVAD: An unsupervised segment-based robust voice activity detection method

ZH Tan, N Dehak - Computer Speech & Language, 2020 - Elsevier
This paper presents an unsupervised segment-based method for robust voice activity
detection (rVAD). The method consists of two passes of denoising followed by a voice …

MarbleNet: Deep 1D time-channel separable convolutional neural network for voice activity detection

F Jia, S Majumdar, B Ginsburg - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
We present MarbleNet, an end-to-end neural network for Voice Activity Detection (VAD).
MarbleNet is a deep residual network composed from blocks of 1D time-channel separable …

BreathTrack: detecting regular breathing phases from unannotated acoustic data captured by a smartphone

B Islam, MM Rahman, T Ahmed, MY Ahmed… - Proceedings of the …, 2021 - dl.acm.org
Breathing biomarkers, such as breathing rate, fractional inspiratory time, and inhalation-
exhalation ratio, are vital for monitoring the user's health and well-being. Accurate estimation …

Look&listen: Multi-modal correlation learning for active speaker detection and speech enhancement

J Xiong, Y Zhou, P Zhang, L Xie… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Active speaker detection and speech enhancement have become two increasingly attractive
topics in audio-visual scenario understanding. According to their respective characteristics …

Speech enhancement using a DNN-augmented colored-noise Kalman filter

H Yu, WP Zhu, B Champagne - Speech Communication, 2020 - Elsevier
In this paper, we propose a new speech enhancement system using a deep neural network
(DNN)-augmented colored-noise Kalman filter. In our system, both clean speech and noise …

Ultra-low-power voice activity detection system using level-crossing sampling

M Faghani, H Rezaee-Dehsorkh, N Ravanshad… - Electronics, 2023 - mdpi.com
This paper presents an ultra-low-power voice activity detection (VAD) system to discriminate
speech from non-speech parts of audio signals. The proposed VAD system uses level …

End-to-end audiovisual speech activity detection with bimodal recurrent neural models

F Tao, C Busso - Speech Communication, 2019 - Elsevier
Speech activity detection (SAD) plays an important role in current speech processing
systems, including automatic speech recognition (ASR). SAD is particularly difficult in …

Noise Cancellation Based on Voice Activity Detection Using Spectral Variation for Speech Recognition in Smart Home Devices

JS Park, SH Kim - Intelligent Automation & Soft Computing, 2020 - cdn.techscience.cn
Variety types of smart home devices have a main function of a human-machine interaction
by speech recognition. Speech recognition system may be vulnerable to rapidly changing …

Robust speech activity detection in movie audio: Data resources and experimental evaluation

R Hebbar, K Somandepalli… - ICASSP 2019-2019 …, 2019 - ieeexplore.ieee.org
Speech activity detection in highly variable acoustic conditions is a challenging task. Many
approaches to detect speech activity in such conditions involve an inherent knowledge of …