Speaker recognition based on deep learning: An overview
Speaker recognition is a task of identifying persons from their voices. Recently, deep
learning has dramatically revolutionized speaker recognition. However, there is lack of …
Overview of speaker modeling and its applications: From the lens of deep speaker representation learning
Speaker individuality information is among the most critical elements within speech signals.
By thoroughly and accurately modeling this information, it can be utilized in various …
Meta-learning for short utterance speaker recognition with imbalance length pairs
In practical settings, a speaker recognition system needs to identify a speaker given a short
utterance, while the enrollment utterance may be relatively long. However, existing speaker …
RawNeXt: Speaker verification system for variable-duration utterances with deep layer aggregation and extended dynamic scaling policies
Despite achieving satisfactory performance in speaker verification using deep neural
networks, variable-duration utterances remain a challenge that threatens the robustness of …
Online end-to-end neural diarization with speaker-tracing buffer
Y Xue, S Horiguchi, Y Fujita… - 2021 IEEE Spoken …, 2021 - ieeexplore.ieee.org
This paper proposes a novel online speaker diarization algorithm based on a fully
supervised self-attention mechanism (SA-EEND). Online diarization inherently presents a …
Replay attack detection with complementary high-resolution information using end-to-end DNN for the ASVspoof 2019 challenge
In this study, we concentrate on replacing the process of extracting hand-crafted acoustic
feature with end-to-end DNN using complementary high-resolution spectrograms. As a …
ECAPA2: A hybrid neural network architecture and training strategy for robust speaker embeddings
J Thienpondt, K Demuynck - 2023 IEEE Automatic Speech …, 2023 - ieeexplore.ieee.org
In this paper, we present ECAPA2, a novel hybrid neural network architecture and training
strategy to produce robust speaker embeddings. Most speaker verification models are …
Knowledge distillation in acoustic scene classification
Common acoustic properties that different classes share degrade the performance of
acoustic scene classification systems. This results in a phenomenon where a few confusing …
Graph attentive feature aggregation for text-independent speaker verification
The objective of this paper is to combine multiple frame-level features into a single utterance-
level representation considering pair-wise relationships. For this purpose, we propose a …
Towards robust speaker verification with target speaker enhancement
This paper proposes the target speaker enhancement based speaker verification network
(TASE-SVNet), an all neural model that couples target speaker enhancement and speaker …