SyncTalk: The devil is in the synchronization for talking head synthesis

Z Peng, W Hu, Y Shi, X Zhu, X Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Achieving high synchronization in the synthesis of realistic speech-driven talking head
videos presents a significant challenge. Traditional Generative Adversarial Networks (GANs) …

FaceChain-ImagineID: Freely crafting high-fidelity diverse talking faces from disentangled audio

C Xu, Y Liu, J Xing, W Wang, M Sun… - Proceedings of the …, 2024 - openaccess.thecvf.com
In this paper, we abstract the process of people hearing speech, extracting meaningful cues,
and creating various dynamically audio-consistent talking faces, termed Listening and …

FlowVQTalker: High-quality emotional talking face generation through normalizing flow and quantization

S Tan, B Ji, Y Pan - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Generating emotional talking faces is a practical yet challenging endeavor. To create a
lifelike avatar, we draw upon two critical insights from a human perspective: 1) The …

Deepfake generation and detection: A benchmark and survey

G Pei, J Zhang, M Hu, Z Zhang, C Wang, Y Wu… - arXiv preprint arXiv …, 2024 - arxiv.org
Deepfake is a technology dedicated to creating highly realistic facial images and videos
under specific conditions, which has significant application potential in fields such as …

Enhancing visibility in nighttime haze images using guided apsf and gradient adaptive convolution

Y Jin, B Lin, W Yan, Y Yuan, W Ye, RT Tan - Proceedings of the 31st …, 2023 - dl.acm.org
Visibility in hazy nighttime scenes is frequently reduced by multiple factors, including low
light, intense glow, light scattering, and the presence of multicolored light sources. Existing …

EDTalk: Efficient disentanglement for emotional talking head synthesis

S Tan, B Ji, M Bi, Y Pan - European Conference on Computer Vision, 2024 - Springer
Achieving disentangled control over multiple facial motions and accommodating diverse
input modalities greatly enhances the application and entertainment of the talking head …

Video and audio deepfake datasets and open issues in deepfake technology: being ahead of the curve

Z Akhtar, TL Pendyala, VS Athmakuri - Forensic Sciences, 2024 - mdpi.com
The revolutionary breakthroughs in Machine Learning (ML) and Artificial Intelligence (AI) are
extensively being harnessed across a diverse range of domains, e.g., forensic science …

SelfTalk: A self-supervised commutative training diagram to comprehend 3D talking faces

Z Peng, Y Luo, Y Shi, H Xu, X Zhu, H Liu, J He… - Proceedings of the 31st …, 2023 - dl.acm.org
Speech-driven 3D face animation is a technique whose applications extend to various
multimedia fields. Previous research has generated promising realistic lip movements and …

AV-Deepfake1M: A large-scale LLM-driven audio-visual deepfake dataset

Z Cai, S Ghosh, AP Adatia, M Hayat, A Dhall… - Proceedings of the …, 2024 - dl.acm.org
The detection and localization of highly realistic deepfake audio-visual content are
challenging even for the most advanced state-of-the-art methods. While most of the research …

VLOGGER: Multimodal diffusion for embodied avatar synthesis

E Corona, A Zanfir, EG Bazavan, N Kolotouros… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose VLOGGER, a method for audio-driven human video generation from a single
input image of a person, which builds on the success of recent generative diffusion models …