Computational intelligence in processing of speech acoustics: a survey

A Singh, N Kaur, V Kukreja, V Kadyan… - Complex & Intelligent …, 2022 - Springer
Speech recognition of a language is a key area in the field of pattern recognition. This paper
presents a comprehensive survey on the speech recognition techniques for non-Indian and …

Framewise speech-nonspeech classification by neural networks for voice activity detection with statistical noise suppression

Y Obuchi - 2016 IEEE International Conference on Acoustics …, 2016 - ieeexplore.ieee.org
A new voice activity detection (VAD) algorithm is proposed. The proposed algorithm is the
combination of augmented statistical noise suppression (ASNS) and convolutional neural …

Phase aware deep neural network for noise robust voice activity detection

L Wang, K Phapatanaburi, Z Go… - … on Multimedia and …, 2017 - ieeexplore.ieee.org
Phase information is ignored for almost all voice activity detection (VAD). To exploit full
information in the original signal, this paper proposes a deep neural network (DNN) using …

Noise robust voice activity detection using joint phase and magnitude based feature enhancement

K Phapatanaburi, L Wang, Z Oo, W Li… - Journal of ambient …, 2017 - Springer
Recently, deep neural network (DNN)-based feature enhancement has been proposed for
many speech applications. DNN-enhanced features have achieved higher performance …

Robust voice activity detection based on concept of modulation transfer function in noisy reverberant environments

S Morita, M Unoki, X Lu, M Akagi - Journal of Signal Processing Systems, 2016 - Springer
Voice activity detection (VAD) is used to detect speech and non-speech periods from
observed speech signals. It is an important front-end technique for many speech technology …

CENSREC-4: development of evaluation framework for distant-talking speech recognition under reverberant environments.

M Nakayama, T Nishiura, Y Denda… - …, 2008 - me.cs.scitec.kobe-u.ac.jp
In this paper, we newly introduce a collection of databases and evaluation tools called
CENSREC-4, which is an evaluation framework for distant-talking speech under hands-free …

DNN-based voice activity detection using auxiliary speech models in noisy environments

Y Tachioka - 2018 IEEE international conference on acoustics …, 2018 - ieeexplore.ieee.org
Voice activity detection (VAD) is essential for automatic speech recognition (ASR) in noisy
environments. Deep neural network (DNN)-based VAD is more powerful than previous …

Noise-robust voice conversion based on sparse spectral mapping using non-negative matrix factorization

R Aihara, R Takashima, T Takiguchi… - … on Information and …, 2014 - search.ieice.org
This paper presents a voice conversion (VC) technique for noisy environments based on a
sparse representation of speech. Sparse representation-based VC using Non-negative …

Close/distant talker discrimination based on kurtosis of linear prediction residual signals

K Hayashida, M Nakayama, T Nishiura… - … , Speech and Signal …, 2014 - ieeexplore.ieee.org
Desired/undesired speech discrimination is as important as speech/non-speech
discrimination to achieve useful applications such as speech interfaces and …

Multimodal voice conversion using non-negative matrix factorization in noisy environments

K Masaka, R Aihara, T Takiguchi… - 2014 IEEE International …, 2014 - ieeexplore.ieee.org
This paper presents a multimodal voice conversion (VC) method for noisy environments. In
our previous NMF-based VC method, source exemplars and target exemplars are extracted …