Google Scholar

Recent advances in speech language models: A survey

W Cui, D Yu, X Jiao, Z Meng, G Zhang, Q Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Large Language Models (LLMs) have recently garnered significant attention, primarily for
their capabilities in text-based interactions. However, natural human interaction often relies …

Cited by 5 · All 2 versions

[PDF] arxiv.org

ConferencingSpeech 2022 Challenge: Non-intrusive objective speech quality assessment (NISQA) challenge for online conferencing applications

G Yi, W Xiao, Y Xiao, B Naderi, S Möller… - arXiv preprint arXiv …, 2022 - arxiv.org

With the advances in speech communication systems such as online conferencing
applications, we can seamlessly work with people regardless of where they are. However …

Cited by 41 · All 10 versions

[PDF] arxiv.org

The VoiceMOS Challenge 2023: Zero-shot subjective speech quality prediction for multiple domains

E Cooper, WC Huang, Y Tsao… - 2023 IEEE Automatic …, 2023 - ieeexplore.ieee.org

We present the second edition of the VoiceMOS Challenge, a scientific event that aims to
promote the study of automatic prediction of the mean opinion score (MOS) of synthesized …

Cited by 33 · All 6 versions

[PDF] arxiv.org

TorchAudio-Squim: Reference-less speech quality and intelligibility measures in TorchAudio

A Kumar, K Tan, Z Ni, P Manocha… - ICASSP 2023-2023 …, 2023 - ieeexplore.ieee.org

Measuring quality and intelligibility of a speech signal is usually a critical step in
development of speech processing systems. To enable this, a variety of metrics to measure …

Cited by 44 · All 3 versions

[PDF] usenix.org

SMACK: Semantically Meaningful Adversarial Audio Attack

Z Yu, Y Chang, N Zhang, C Xiao - 32nd USENIX Security Symposium …, 2023 - usenix.org

Voice controllable systems rely on speech recognition and speaker identification as the key
enabling technologies. While they bring revolutionary changes to our daily lives, their …

Cited by 21 · All 4 versions

[PDF] ieee.org Full View‏

ICASSP 2024 Speech Signal Improvement Challenge

NC Ristea, B Naderi, A Saabas, R Cutler… - IEEE Open Journal …, 2025 - ieeexplore.ieee.org

The ICASSP 2024 Speech Signal Improvement Challenge aims to advance research in
enhancing speech signal quality within communication systems. The speech signal quality …

Cited by 11

[PDF] acm.org

AntiFake: Using adversarial audio to prevent unauthorized speech synthesis

Z Yu, S Zhai, N Zhang - Proceedings of the 2023 ACM SIGSAC …, 2023 - dl.acm.org

The rapid development of deep neural networks and generative AI has catalyzed growth in
realistic speech synthesis. While this technology has great potential to improve lives, it also …

Cited by 21 · All 5 versions

[PDF] acm.org

When evil calls: Targeted adversarial voice over IP network

H Liu, Z Yu, M Zha, XF Wang, W Yeoh… - Proceedings of the …, 2022 - dl.acm.org

As the COVID-19 pandemic fundamentally reshaped the remote life and working styles,
Voice over IP (VoIP) telephony and video conferencing have become a primary method of …

Cited by 18 · All 5 versions

[PDF] arxiv.org

Speech quality assessment through MOS using non-matching references

P Manocha, A Kumar - arXiv preprint arXiv:2206.12285, 2022 - arxiv.org

Human judgments obtained through Mean Opinion Scores (MOS) are the most reliable way
to assess the quality of speech signals. However, several recent attempts to automatically …

Cited by 33 · All 5 versions

[PDF] usenix.org

Can I hear your face? Pervasive attack on voice authentication systems with a single face image

N Jiang, B Sun, T Sim, J Han - 33rd USENIX Security Symposium …, 2024 - usenix.org

We present Foice, a novel deepfake attack against voice authentication systems. Foice
generates a synthetic voice of the victim from just a single image of the victim's face, without …

Cited by 3 · All 5 versions

NISQA: A deep CNN-self-attention model for multidimensional speech quality prediction with...

Recent advances in speech language models: A survey‏
