Speaker identification features extraction methods: A systematic review

SS Tirumala, SR Shahamiri, AS Garhwal… - Expert Systems with …, 2017 - Elsevier
Speaker Identification (SI) is the process of identifying the speaker from a given utterance by
comparing the voice biometrics of the utterance with those utterance models stored …

Emotion recognition from speech using wav2vec 2.0 embeddings

L Pepino, P Riera, L Ferrer - arXiv preprint arXiv:2104.03502, 2021 - arxiv.org
Emotion recognition datasets are relatively small, making the use of the more sophisticated
deep learning approaches challenging. In this work, we propose a transfer learning method …

Fusing MFCC and LPC features using 1D triplet CNN for speaker recognition in severely degraded audio signals

A Chowdhury, A Ross - IEEE Transactions on Information …, 2019 - ieeexplore.ieee.org
Speaker recognition algorithms are negatively impacted by the quality of the input speech
signal. In this work, we approach the problem of speaker recognition from severely …

Noise invariant frame selection: a simple method to address the background noise problem for text-independent speaker verification

S Song, S Zhang, BW Schuller, L Shen… - … Joint Conference on …, 2018 - ieeexplore.ieee.org
The performance of speaker-related systems usually degrades heavily in practical
applications largely due to the presence of background noise. To improve the robustness of …

Emotion recognition using hybrid Gaussian mixture model and deep neural network

I Shahin, AB Nassif, S Hamsa - IEEE Access, 2019 - ieeexplore.ieee.org
This paper aims at recognizing emotions for a text-independent and speaker-independent
emotion recognition system based on a novel classifier, which is a hybrid of a cascaded …

CASA-based speaker identification using cascaded GMM-CNN classifier in noisy and emotional talking conditions

AB Nassif, I Shahin, S Hamsa, N Nemmour… - Applied Soft …, 2021 - Elsevier
This work aims at intensifying text-independent speaker identification performance in real
application situations such as noisy and emotional talking conditions. This is achieved by …

An analysis of the influence of deep neural network (DNN) topology in bottleneck feature based language recognition

A Lozano-Diez, R Zazo, DT Toledano… - PLoS ONE, 2017 - journals.plos.org
Language recognition systems based on bottleneck features have recently become the
state-of-the-art in this research field, showing its success in the last Language Recognition …

HMM-based phrase-independent i-vector extractor for text-dependent speaker verification

H Zeinali, H Sameti, L Burget - IEEE/ACM Transactions on …, 2017 - ieeexplore.ieee.org
The low-dimensional i-vector representation of speech segments is used in the
state-of-the-art text-independent speaker verification systems. However, i-vectors were deemed …

Novel cascaded Gaussian mixture model-deep neural network classifier for speaker identification in emotional talking environments

I Shahin, AB Nassif, S Hamsa - Neural Computing and Applications, 2020 - Springer
This research is an effort to present an effective approach to enhance text-independent
speaker identification performance in emotional talking environments based on novel …

Analysis of DNN speech signal enhancement for robust speaker recognition

O Novotný, O Plchot, O Glembek, L Burget - Computer Speech & …, 2019 - Elsevier
In this work, we present an analysis of a DNN-based autoencoder for speech enhancement,
dereverberation and denoising. The target application is a robust speaker verification (SV) …