- Academic Search

Deepfakes generation and detection: State-of-the-art, open challenges, countermeasures, and way forward

M Masood, M Nawaz, KM Malik, A Javed, A Irtaza… - Applied …, 2023 - Springer

Easy access to audio-visual content on social media, combined with the availability of
modern tools such as TensorFlow or Keras, and open-source trained models, along with …

Cited by 431 · All 11 versions

[PDF] arxiv.org

A review on methods and applications in multimodal deep learning

S Jabeen, X Li, MS Amin, O Bourahla, S Li… - ACM Transactions on …, 2023 - dl.acm.org

Deep Learning has implemented a wide range of applications and has become increasingly
popular in recent years. The goal of multimodal deep learning (MMDL) is to create models …

Cited by 101 · All 7 versions

[PDF] thecvf.com

SadTalker: Learning realistic 3D motion coefficients for stylized audio-driven single image talking face animation

W Zhang, X Cun, X Wang, Y Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Generating talking head videos through a face image and a piece of speech audio still
contains many challenges, i.e., unnatural head movement, distorted expression, and identity …

Cited by 249 · All 8 versions

[PDF] neurips.cc

VASA-1: Lifelike audio-driven talking faces generated in real time

S Xu, G Chen, YX Guo, J Yang, C Li… - Advances in …, 2025 - proceedings.neurips.cc

We introduce VASA, a framework for generating lifelike talking faces with appealing visual
affective skills (VAS) given a single static image and a speech audio clip. Our premiere …

Cited by 61 · All 5 versions

[PDF] thecvf.com

CodeTalker: Speech-driven 3D facial animation with discrete motion prior

J Xing, M Xia, Y Zhang, X Cun… - Proceedings of the …, 2023 - openaccess.thecvf.com

Speech-driven 3D facial animation has been widely studied, yet there is still a gap to
achieving realism and vividness due to the highly ill-posed nature and scarcity of audio …

Cited by 151 · All 11 versions

[PDF] thecvf.com

Diffused Heads: Diffusion models beat GANs on talking-face generation

M Stypułkowski, K Vougioukas, S He… - Proceedings of the …, 2024 - openaccess.thecvf.com

Talking face generation has historically struggled to produce head movements and natural
facial expressions without guidance from additional reference videos. Recent developments …

Cited by 135 · All 6 versions

[PDF] thecvf.com

DiffTalk: Crafting diffusion models for generalized audio-driven portraits animation

S Shen, W Zhao, Z Meng, W Li… - Proceedings of the …, 2023 - openaccess.thecvf.com

Talking head synthesis is a promising approach for the video production industry. Recently,
a lot of effort has been devoted in this research area to improve the generation quality or …

Cited by 108 · All 8 versions

[PDF] thecvf.com

Seeing what you said: Talking face generation guided by a lip reading expert

J Wang, X Qian, M Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Talking face generation, also known as speech-to-lip generation, reconstructs facial motions
concerning lips given coherent speech input. The previous studies revealed the importance …

Cited by 85 · All 6 versions

[PDF] mpg.de

[PDF] Multimodal image synthesis and editing: A survey

F Zhan, Y Yu, R Wu, J Zhang, S Lu, L Liu… - arXiv preprint arXiv …, 2022 - pure.mpg.de

As information exists in various modalities in real world, effective interaction and fusion
among multimodal information plays a key role for the creation and perception of multimodal …

Cited by 259 · All 3 versions

[PDF] thecvf.com

AD-NeRF: Audio driven neural radiance fields for talking head synthesis

Y Guo, K Chen, S Liang, YJ Liu… - Proceedings of the …, 2021 - openaccess.thecvf.com

Generating high-fidelity talking head video by fitting with the input audio sequence is a
challenging problem that receives considerable attentions recently. In this paper, we …

Cited by 409 · All 8 versions

Pose-controllable talking face generation by implicitly modularized audio-visual representation