Introduction to voice presentation attack detection and recent advances

M Sahidullah, H Delgado, M Todisco, A Nautsch… - Handbook of Biometric …, 2023 - Springer
Over the past few years, significant progress has been made in the field of presentation
attack detection (PAD) for automatic speaker recognition (ASV). This includes the …

ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech

X Wang, J Yamagishi, M Todisco, H Delgado… - Computer Speech & …, 2020 - Elsevier
Automatic speaker verification (ASV) is one of the most natural and convenient means of
biometric person recognition. Unfortunately, just like all other biometric systems, ASV is …

ASVspoof 5: Crowdsourced speech data, deepfakes, and adversarial attacks at scale

X Wang, H Delgado, H Tak, J Jung, H Shim… - arXiv preprint arXiv …, 2024 - arxiv.org
ASVspoof 5 is the fifth edition in a series of challenges that promote the study of speech
spoofing and deepfake attacks, and the design of detection solutions. Compared to previous …

The limits of the mean opinion score for speech synthesis evaluation

S Le Maguer, S King, N Harte - Computer Speech & Language, 2024 - Elsevier
The release of WaveNet and Tacotron has forever transformed the speech synthesis
landscape. Thanks to these game-changing innovations, the quality of synthetic speech has …

A Comprehensive Survey with Critical Analysis for Deepfake Speech Detection

L Pham, P Lam, T Nguyen, H Tang, D Tran… - arXiv preprint arXiv …, 2024 - arxiv.org
Thanks to advancements in deep learning, speech generation systems now power a variety
of real-world applications, such as text-to-speech for individuals with speech disorders …

Back to the Future: Extending the Blizzard Challenge 2013

S Le Maguer, S King, N Harte - INTERSPEECH, 2022 - researchgate.net
Nowadays, speech synthesis technology is synonymous with the use of Deep Learning. To
understand more about how synthesis systems have progressed with the advent of Deep …

Phonetic accommodation in interaction with a virtual language learning tutor: A Wizard-of-Oz study

I Gessinger, B Möbius, S Le Maguer, E Raveh… - Journal of …, 2021 - Elsevier
We present a Wizard-of-Oz experiment examining phonetic accommodation of human
interlocutors in the context of human-computer interaction. Forty-two native speakers of …

Liaison and pronunciation learning in end-to-end text-to-speech in French

J Taylor, S Le Maguer, K Richmond - The 11th ISCA Speech …, 2021 - research.ed.ac.uk
Abstract Sequence-to-sequence (S2S) TTS models like Tacotron have grapheme-only
inputs when trained fully end-to-end. Grapheme inputs map to phone sounds depending on …

Putting robots in context: Challenging the influence of voice and empathic behaviour on trust

M Romeo, I Torre, S Le Maguer… - 2023 32nd IEEE …, 2023 - ieeexplore.ieee.org
Trust is essential for social interactions, including those between humans and social artificial
agents, such as robots. Several robot-related factors can contribute to the formation of trust …

ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech

X Wang, H Delgado, H Tak, J Jung, H Shim… - arXiv preprint arXiv …, 2025 - arxiv.org
ASVspoof 5 is the fifth edition in a series of challenges which promote the study of speech
spoofing and deepfake attacks as well as the design of detection solutions. We introduce the …