A comprehensive empirical review of modern voice activity detection approaches for movies and TV shows

M Sharma, S Joshi, T Chatterjee, R Hamid - Neurocomputing, 2022 - Elsevier
A robust and language agnostic Voice Activity Detection (VAD) is crucial for Digital
Entertainment Content (DEC). Primary examples of DEC include movies and TV series …

An automated assessment framework for atypical prosody and stereotyped idiosyncratic phrases related to autism spectrum disorder

M Li, D Tang, J Zeng, T Zhou, H Zhu, B Chen… - Computer Speech & …, 2019 - Elsevier
Abstract Autism Spectrum Disorder (ASD), a neurodevelopmental disability, has become
one of the high incidence diseases among children. Studies indicate that early diagnosis …

Line spectral frequency-based features and extreme learning machine for voice activity detection from audio signal

H Mukherjee, SM Obaidullah, KC Santosh… - International Journal of …, 2018 - Springer
Voice activity detection (VAD) refers to the task of identifying vocal segments from an audio
clip. It helps in reducing the computational overhead as well elevate the recognition …

Robust voice activity detection using an auditory-inspired masked modulation encoder based convolutional attention network

N Li, L Wang, M Ge, M Unoki, S Li, J Dang - Speech Communication, 2024 - Elsevier
Deep learning has revolutionized voice activity detection (VAD) by offering promising
solutions. However, directly applying traditional features, such as raw waveforms and Mel …

AUC optimization for deep learning-based voice activity detection

XL Zhang, M Xu - EURASIP Journal on Audio, Speech, and Music …, 2022 - Springer
Voice activity detection (VAD) based on deep neural networks (DNN) have demonstrated
good performance in adverse acoustic environments. Current DNN-based VAD optimizes a …

RS-MSConvNet: A novel end-to-end pathological voice detection model

W Pathonsuwan, K Phapatanaburi, P Buayai… - IEEE …, 2022 - ieeexplore.ieee.org
Recent studies have reported the success of multi-scale convolution neural network
(MSConvNet) model for many classification applications due to its powerful ability of …

[HTML][HTML] Whispered speech detection using glottal flow-based features

K Phapatanaburi, W Pathonsuwan, L Wang… - Symmetry, 2022 - mdpi.com
Recent studies have reported that the performance of Automatic Speech Recognition (ASR)
technologies designed for normal speech notably deteriorates when it is evaluated by …

Robust voice activity detection using a masked auditory encoder based convolutional neural network

N Li, L Wang, M Unoki, S Li, R Wang… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
Voice activity detection (VAD) based on deep learning has achieved remarkable success.
However, when the traditional features (eg, raw waveforms and MFCCs) are directly fed to …

[PDF][PDF] Lower-Limb Motion-Based Ankle-Foot Movement Classification Using 2D-CNN

N Chaobankoh, T Jumphoo, M Uthansakul… - Comput Mater …, 2022 - researchgate.net
Recently, the Muscle-Computer Interface (MCI) has been extensively popular for employing
Electromyography (EMG) signals to help the development of various assistive devices …

Voice activity detection systems and methods

SM Kaskari, F Nesta - US Patent 10,504,539, 2019 - Google Patents
An audio processing device or method includes an audio transducer operable to receive
audio input and generate an audio signal based on the audio input. The audio processing …