Recent advances in speech language models: A survey
Large Language Models (LLMs) have recently garnered significant attention, primarily for
their capabilities in text-based interactions. However, natural human interaction often relies …
ConferencingSpeech 2022 Challenge: Non-intrusive objective speech quality assessment (NISQA) challenge for online conferencing applications
With the advances in speech communication systems such as online conferencing
applications, we can seamlessly work with people regardless of where they are. However …
The VoiceMOS Challenge 2023: zero-shot subjective speech quality prediction for multiple domains
We present the second edition of the VoiceMOS Challenge, a scientific event that aims to
promote the study of automatic prediction of the mean opinion score (MOS) of synthesized …
TorchAudio-Squim: Reference-less speech quality and intelligibility measures in TorchAudio
Measuring quality and intelligibility of a speech signal is usually a critical step in
development of speech processing systems. To enable this, a variety of metrics to measure …
SMACK: Semantically Meaningful Adversarial Audio Attack
Voice controllable systems rely on speech recognition and speaker identification as the key
enabling technologies. While they bring revolutionary changes to our daily lives, their …
ICASSP 2024 speech signal improvement challenge
The ICASSP 2024 Speech Signal Improvement Challenge aims to advance research in
enhancing speech signal quality within communication systems. The speech signal quality …
AntiFake: Using adversarial audio to prevent unauthorized speech synthesis
The rapid development of deep neural networks and generative AI has catalyzed growth in
realistic speech synthesis. While this technology has great potential to improve lives, it also …
When evil calls: Targeted adversarial voice over IP network
As the COVID-19 pandemic fundamentally reshaped the remote life and working styles,
Voice over IP (VoIP) telephony and video conferencing have become a primary method of …
Speech quality assessment through MOS using non-matching references
Human judgments obtained through Mean Opinion Scores (MOS) are the most reliable way
to assess the quality of speech signals. However, several recent attempts to automatically …
Can I hear your face? Pervasive attack on voice authentication systems with a single face image
We present Foice, a novel deepfake attack against voice authentication systems. Foice
generates a synthetic voice of the victim from just a single image of the victim's face, without …