One-class learning towards synthetic voice spoofing detection
Human voices can be used to authenticate the identity of the speaker, but the automatic
speaker verification (ASV) systems are vulnerable to voice spoofing attacks, such as …
Voice conversion challenge 2020: Intra-lingual semi-parallel and cross-lingual voice conversion
The voice conversion challenge is a bi-annual scientific event held to compare and
understand different voice conversion (VC) systems built on a common dataset. In 2020, we …
Generalization ability of MOS prediction networks
Automatic methods to predict listener opinions of synthesized speech remain elusive since
listeners, systems being evaluated, characteristics of the speech, and even the instructions …
StarGANv2-VC: A diverse, unsupervised, non-parallel framework for natural-sounding voice conversion
We present an unsupervised non-parallel many-to-many voice conversion (VC) method
using a generative adversarial network (GAN) called StarGAN v2. Using a combination of …
Deepfake: definitions, performance metrics and standards, datasets and benchmarks, and a meta-review
Recent advancements in AI, especially deep learning, have contributed to a significant
increase in the creation of new realistic-looking synthetic media (video, image, and audio) …
Evaluation of an audio-video multimodal deepfake dataset using unimodal and multimodal detectors
Significant advancements made in the generation of deepfakes have caused security and
privacy issues. Attackers can easily impersonate a person's identity in an image by replacing …
Human perception of audio deepfakes
The recent emergence of deepfakes has brought manipulated and generated content to the
forefront of machine learning research. Automatic detection of deepfakes has seen many …
How do voices from past speech synthesis challenges compare today?
E Cooper, J Yamagishi - arXiv preprint arXiv:2105.02373, 2021 - arxiv.org
Shared challenges provide a venue for comparing systems trained on common data using a
standardized evaluation, and they also provide an invaluable resource for researchers when …
UR channel-robust synthetic speech detection system for ASVspoof 2021
In this paper, we present UR-AIR system submission to the logical access (LA) and the
speech deepfake (DF) tracks of the ASVspoof 2021 Challenge. The LA and DF tasks focus …
Known-unknown data augmentation strategies for detection of logical access, physical access and speech deepfake attacks: ASVspoof 2021
RK Das - Proc. 2021 Edition of the Automatic Speaker …, 2021 - isca-archive.org
The rise in demand of voice biometric systems also increases the threat from various kinds
of spoofing attacks from unauthorized users. The latest ASVspoof 2021 challenge devotes to …