A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

A Survey on Speech Deepfake Detection

M Li, Y Ahmadiadli, XP Zhang - ACM Computing Surveys, 2025 - dl.acm.org
The availability of smart devices leads to an exponential increase in multimedia content.
However, advancements in deep learning have also enabled the creation of highly …

Graph attention-based deep embedded clustering for speaker diarization

Y Wei, H Guo, Z Ge, Z Yang - Speech Communication, 2023 - Elsevier
Deep speaker embedding extraction models have recently served as the cornerstone for
modular speaker diarization systems. However, in current modular systems, the extracted …

Speaker verification using attentive multi-scale convolutional recurrent network

Y Li, Z Jiang, W Cao, Q Huang - Applied Soft Computing, 2022 - Elsevier
In this paper, we propose a speaker verification method by an Attentive Multi-scale
Convolutional Recurrent Network (AMCRN). The proposed AMCRN can acquire both local …

Audio Anti-Spoofing Detection: A Survey

M Li, Y Ahmadiadli, XP Zhang - arXiv preprint arXiv:2404.13914, 2024 - arxiv.org
The availability of smart devices leads to an exponential increase in multimedia content.
However, the rapid advancements in deep learning have given rise to sophisticated …

Class token and knowledge distillation for multi-head self-attention speaker verification systems

V Mingote, A Miguel, A Ortega, E Lleida - Digital Signal Processing, 2023 - Elsevier
This paper explores three novel approaches to improve the performance of speaker
verification (SV) systems based on deep neural networks (DNN) using Multi-head Self …

End-to-end deep speaker embedding learning using multi-scale attentional fusion and graph neural networks

HB Kashani, S Jazmi - Expert Systems with Applications, 2023 - Elsevier
As an attractive research in biometric authentication, Text Independent Speaker Verification
(TI-SV) problem aims to specify whether two given unconstrained utterances come from the …

One-Step Knowledge Distillation and Fine-Tuning in Using Large Pre-Trained Self-Supervised Learning Models for Speaker Verification

J Heo, C Lim, J Kim, H Shin, HJ Yu - arXiv preprint arXiv:2305.17394, 2023 - arxiv.org
The application of speech self-supervised learning (SSL) models has achieved remarkable
performance in speaker verification (SV). However, there is a computational cost hurdle in …

Distance Metric-Based Open-Set Domain Adaptation for Speaker Verification

J Li, J Han, F Qian, T Zheng, Y He… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
Domain shift poses a significant challenge in speaker verification, especially in open-set
scenarios where the speaker categories are disjoint between the source and target …

Speaker recognition using isomorphic graph attention network based pooling on self-supervised representation

Z Ge, X Xu, H Guo, T Wang, Z Yang - Applied Acoustics, 2024 - Elsevier
The emergence of self-supervised representation (i.e., wav2vec 2.0) allows
speaker-recognition approaches to process spoken signals through foundation models built on …