Cross-corpora spoken language identification with domain diversification and generalization
This work addresses the cross-corpora generalization issue for the low-resourced spoken
language identification (LID) problem. We have conducted the experiments in the context of …
Towards Cross-Corpora Generalization for Low-Resource Spoken Language Identification
Low-resource spoken language identification (LID) systems are prone to poor generalization
across unknown domains. In this study, using multiple widely used low-resourced South …
Wavelet scattering transform for improving generalization in low-resourced spoken language identification
Commonly used features in spoken language identification (LID), such as mel-spectrogram
or MFCC, lose high-frequency information due to windowing. The loss further increases for …
Multi-Dataset Co-Training with Sharpness-Aware Optimization for Audio Anti-spoofing
Audio anti-spoofing for automatic speaker verification aims to safeguard users' identities
from spoofing attacks. Although state-of-the-art spoofing countermeasure (CM) models …
Beyond silence: Bias analysis through loss and asymmetric approach in audio anti-spoofing
Current trends in audio anti-spoofing detection research strive to improve models' ability to
generalize across unseen attacks by learning to identify a variety of spoofing artifacts. This …
Balance, Multiple Augmentation, and Re-synthesis: A Triad Training Strategy for Enhanced Audio Deepfake Detection
The detection of deepfake voices has become increasingly challenging. Finding the
boundary that separates real and synthetic voices requires a good training set and an …
Self-distillation framework for improving fake speech detection in the domain variability scenario
V Samhita, V Viju, B Bharathi - Neural Computing and Applications, 2024 - Springer
Robust fake speech detection systems are crucial in an era where audio recordings can be
easily altered or developed due to advancements in technology. The potential impact of this …
Robustness of language recognition system to transmission channel
R Duroselle - 2021 - hal.science
Language recognition is the task of predicting the language used in a test speech utterance.
Since 2017, the best performing systems have been based on a deep neural network which …