Deep neural network techniques for monaural speech enhancement and separation: state of the art analysis

P Ochieng - Artificial Intelligence Review, 2023 - Springer
Deep neural networks (DNN) techniques have become pervasive in domains such as
natural language processing and computer vision. They have achieved great success in …

Overview of speaker modeling and its applications: From the lens of deep speaker representation learning

S Wang, Z Chen, KA Lee, Y Qian… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
Speaker individuality information is among the most critical elements within speech signals.
By thoroughly and accurately modeling this information, it can be utilized in various …

Quantitative evidence on overlooked aspects of enrollment speaker embeddings for target speaker separation

X Liu, X Li, J Serrà - ICASSP 2023-2023 IEEE International …, 2023 - ieeexplore.ieee.org
Single channel target speaker separation (TSS) aims at extracting a speaker's voice from a
mixture of multiple talkers given an enrollment utterance of that speaker. A typical deep …

Robust channel learning for large-scale radio speaker verification

W Yang, J Wei, W Lu, L Li, X Lu - IEEE Journal of Selected …, 2024 - ieeexplore.ieee.org
Recent research in speaker verification has increasingly focused on achieving robust and
reliable recognition under challenging channel conditions and noisy environments …

SEDENOSS: SEparating and DENOising Seismic Signals with dual‐path recurrent neural network architecture

A Novoselov, P Balazs… - Journal of Geophysical …, 2022 - Wiley Online Library
Seismologists have to deal with overlapping and noisy signals. Techniques such as source
separation can be used to solve this problem. Over the past few decades, signal processing …

Towards robust speaker verification with target speaker enhancement

C Zhang, M Yu, C Weng, D Yu - ICASSP 2021-2021 IEEE …, 2021 - ieeexplore.ieee.org
This paper proposes the target speaker enhancement based speaker verification network
(TASE-SVNet), an all neural model that couples target speaker enhancement and speaker …

Optimal Transport with Class Structure Exploration for Cross-Domain Speech Emotion Recognition

R Zhang, J Wei, X Lu, J Xu, Y Li, D **… - … on Audio, Speech …, 2024 - ieeexplore.ieee.org
Speech emotion recognition (SER) has widespread applications in human-computer
interaction. However, the performance of SER models often drops in domain mismatch …