Google Scholar

Deepfakes as a threat to a speaker and facial recognition: An overview of tools and attack vectors

A Firc, K Malinka, P Hanáček - Heliyon, 2023 - cell.com

Deepfakes present an emerging threat in cyberspace. Recent developments in machine
learning make deepfakes highly believable, and very difficult to differentiate between what is …

Cited by 36 - All 7 versions

[PDF] arxiv.org

USAT: A universal speaker-adaptive text-to-speech approach

W Wang, Y Song, S Jha - IEEE/ACM Transactions on Audio …, 2024 - ieeexplore.ieee.org

Conventional text-to-speech (TTS) research has predominantly focused on enhancing the
quality of synthesized speech for speakers in the training dataset. The challenge of …

Cited by 11 - All 4 versions

[PDF] arxiv.org

TDASS: Target domain adaptation speech synthesis framework for multi-speaker low-resource TTS

X Zhang, J Wang, N Cheng… - 2022 International Joint …, 2022 - ieeexplore.ieee.org

Recently, synthesizing personalized speech by text-to-speech (TTS) application is highly
demanded. But the previous TTS models require a mass of target speaker speeches for …

Cited by 16 - All 4 versions

[PDF] arxiv.org

MetaSID: Singer identification with domain adaptation for metaverse

X Zhang, J Wang, N Cheng… - 2022 International Joint …, 2022 - ieeexplore.ieee.org

Metaverse has stretched the real world into unlimited space. There will be more live concerts
in Metaverse. The task of singer identification is to identify the song belongs to which singer …

Cited by 15 - All 4 versions

[PDF] acm.org

Adaptive transformer-based conditioned variational autoencoder for incomplete social event classification

Z Li, S Qian, J Cao, Q Fang, C Xu - Proceedings of the 30th ACM …, 2022 - dl.acm.org

With the rapid development of the Internet and the expanding scale of social media,
incomplete social event classification has increasingly become a challenging task. The key …

Cited by 9

[PDF] arxiv.org

SUSing: SU-net for singing voice synthesis

X Zhang, J Wang, N Cheng… - 2022 International Joint …, 2022 - ieeexplore.ieee.org

Singing voice synthesis is a generative task that involves multi-dimensional control of the
singing model, including lyrics, pitch, and duration, and includes the timbre of the singer and …

Cited by 14 - All 4 versions

[PDF] isca-archive.org

[PDF] FVTTS: Face based voice synthesis for text-to-speech

M Lee, E Park, S Hong - Proc. Interspeech 2024, 2024 - isca-archive.org

A face is expressive of individual identity and used in various studies such as identification,
authentication, and personalization. Similarly, a voice is a means of expressing individuals …

Cited by 2 - All 3 versions

[PDF] mlr.press

Pose guided human image synthesis with partially decoupled GAN

J Wu, S Si, J Wang, X Qu, X **g - Asian Conference on …, 2023 - proceedings.mlr.press

Pose Guided Human Image Synthesis (PGHIS) is a challenging task of transforming
a human image from the reference pose to a target pose while preserving its style. Most …

Cited by 4 - All 4 versions

[PDF] arxiv.org

Semi-supervised learning based on reference model for low-resource TTS

X Zhang, J Wang, N Cheng… - 2022 18th International …, 2022 - ieeexplore.ieee.org

Most previous neural text-to-speech (TTS) methods are mainly based on supervised
learning methods, which means they depend on a large training dataset and hard to achieve …

Cited by 6 - All 5 versions

[PDF] arxiv.org

MDCNN-SID: Multi-scale dilated convolution network for singer identification

X Zhang, J Wang, N Cheng… - 2022 International Joint …, 2022 - ieeexplore.ieee.org

Most singer identification methods are processed in the frequency domain, which potentially
leads to information loss during the spectral transformation. In this paper, instead of the …

Cited by 10 - All 3 versions

nnspeech: Speaker-guided conditional variational autoencoder for zero-shot multi-speaker...

Deepfakes as a threat to a speaker and facial recognition: An overview of tools and attack vectors‏