Cross-corpora spoken language identification with domain diversification and generalization

S Dey, M Sahidullah, G Saha - Computer Speech & Language, 2023 - Elsevier
This work addresses the cross-corpora generalization issue for the low-resourced spoken
language identification (LID) problem. We have conducted the experiments in the context of …

Towards Cross-Corpora Generalization for Low-Resource Spoken Language Identification

S Dey, M Sahidullah, G Saha - IEEE/ACM Transactions on …, 2024 - ieeexplore.ieee.org
Low-resource spoken language identification (LID) systems are prone to poor generalization
across unknown domains. In this study, using multiple widely used low-resourced South …

Wavelet scattering transform for improving generalization in low-resourced spoken language identification

S Dey, P Singh, G Saha - arXiv preprint arXiv:2310.00602, 2023 - arxiv.org
Commonly used features in spoken language identification (LID), such as mel-spectrogram
or MFCC, lose high-frequency information due to windowing. The loss further increases for …

Multi-Dataset Co-Training with Sharpness-Aware Optimization for Audio Anti-spoofing

H Shim, J Jung, T Kinnunen - arXiv preprint arXiv:2305.19953, 2023 - arxiv.org
Audio anti-spoofing for automatic speaker verification aims to safeguard users' identities
from spoofing attacks. Although state-of-the-art spoofing countermeasure (CM) models …

Beyond silence: Bias analysis through loss and asymmetric approach in audio anti-spoofing

H Shim, M Sahidullah, J Jung, S Watanabe… - arXiv preprint arXiv …, 2024 - arxiv.org
Current trends in audio anti-spoofing detection research strive to improve models' ability to
generalize across unseen attacks by learning to identify a variety of spoofing artifacts. This …

Balance, Multiple Augmentation, and Re-synthesis: A Triad Training Strategy for Enhanced Audio Deepfake Detection

TP Doan, L Nguyen-Vu, K Hong, S Jung - Proc. Interspeech 2024, 2024 - isca-archive.org
The detection of deepfake voices has become increasingly challenging. Finding the
boundary that separates real and synthetic voices requires a good training set and an …

Self-distillation framework for improving fake speech detection in the domain variability scenario

V Samhita, V Viju, B Bharathi - Neural Computing and Applications, 2024 - Springer
Robust fake speech detection systems are crucial in an era where audio recordings can be
easily altered or developed due to advancements in technology. The potential impact of this …

Robustness of language recognition system to transmission channel

R Duroselle - 2021 - hal.science
Language recognition is the task of predicting the language used in a test speech utterance.
Since 2017, the best performing systems have been based on a deep neural network which …