A review of deep learning techniques for speech processing

A Mehrish, N Majumder, R Bharadwaj, R Mihalcea… - Information …, 2023 - Elsevier
The field of speech processing has undergone a transformative shift with the advent of deep
learning. The use of multiple processing layers has enabled the creation of models capable …

CN-Celeb: multi-genre speaker recognition

L Li, R Liu, J Kang, Y Fan, H Cui, Y Cai, R Vipperla… - Speech …, 2022 - Elsevier
Research on speaker recognition is extending to address the vulnerability in the wild
conditions, among which genre mismatch is perhaps the most challenging, for instance …

Domain generalization with relaxed instance frequency-wise normalization for multi-device acoustic scene classification

B Kim, S Yang, J Kim, H Park, J Lee… - arXiv preprint arXiv …, 2022 - arxiv.org
While using two-dimensional convolutional neural networks (2D-CNNs) in image
processing, it is possible to manipulate domain information using channel statistics, and …

Meta-generalization for domain-invariant speaker verification

H Zhang, L Wang, KA Lee, M Liu… - … /ACM Transactions on …, 2023 - ieeexplore.ieee.org
Automatic speaker verification (ASV) exhibits unsatisfactory performance under domain
mismatch conditions owing to intrinsic and extrinsic factors, such as variations in speaking …

Playing a part: Speaker verification at the movies

A Brown, J Huh, A Nagrani, JS Chung… - ICASSP 2021-2021 …, 2021 - ieeexplore.ieee.org
The goal of this work is to investigate the performance of popular speaker recognition
models on speech segments from movies, where often actors intentionally disguise their …

A model-agnostic meta-baseline method for few-shot fault diagnosis of wind turbines

X Liu, W Teng, Y Liu - Sensors, 2022 - mdpi.com
The technology of fault diagnosis is helpful to improve the reliability of wind turbines, and
further reduce the operation and maintenance cost at wind farms. However, in reality, wind …

Mutual information-based embedding decoupling for generalizable speaker verification

J Li, J Han, S Deng, T Zheng, Y He, G Zheng - Proc. Interspeech, 2023 - isca-archive.org
Domain shift is a challenging problem in speaker verification, especially when
dealing with unseen target domains. Recently, embedding decoupling-based methods have …

Model-agnostic meta-learning for fast text-dependent speaker embedding adaptation

W Lin, MW Mak - IEEE/ACM Transactions on Audio, Speech …, 2023 - ieeexplore.ieee.org
By constraining the lexical content of input speech, text-dependent speaker verification
(TD-SV) offers more reliable performance than text-independent speaker verification (TI-SV) …

Improving generalization ability of countermeasures for new mismatch scenario by combining multiple advanced regularization terms

C Zeng, X Wang, X Miao, E Cooper… - arXiv preprint arXiv …, 2023 - arxiv.org
The ability of countermeasure models to generalize from seen speech synthesis methods to
unseen ones has been investigated in the ASVspoof challenge. However, a new mismatch …

Learning domain-invariant transformation for speaker verification

H Zhang, L Wang, KA Lee, M Liu… - ICASSP 2022-2022 …, 2022 - ieeexplore.ieee.org
Automatic speaker verification (ASV) faces domain shift caused by the mismatch of intrinsic
and extrinsic factors such as recording device and speaking style in real-world applications …