Media forensics and deepfakes: an overview

L Verdoliva - IEEE journal of selected topics in signal …, 2020 - ieeexplore.ieee.org
With the rapid progress in recent years, techniques that generate and manipulate
multimedia content can now provide a very advanced level of realism. The boundary …

A comprehensive overview of Deepfake: Generation, detection, datasets, and opportunities

JW Seow, MK Lim, RCW Phan, JK Liu - Neurocomputing, 2022 - Elsevier
When used maliciously, deepfakes can have detrimental implications for political and social
forces, including reducing public trust in institutions, damaging the reputation of prominent …

Structure and content-guided video synthesis with diffusion models

P Esser, J Chiu, P Atighehchian… - Proceedings of the …, 2023 - openaccess.thecvf.com
Text-guided generative diffusion models unlock powerful image creation and editing tools.
Recent approaches that edit the content of footage while retaining structure require …

Pix2video: Video editing using image diffusion

D Ceylan, CHP Huang, NJ Mitra - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Image diffusion models, trained on massive image collections, have emerged as the most
versatile image generator model in terms of quality and diversity. They support inverting real …

Sadtalker: Learning realistic 3d motion coefficients for stylized audio-driven single image talking face animation

W Zhang, X Cun, X Wang, Y Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Generating talking head videos through a face image and a piece of speech audio still
contains many challenges, i.e., unnatural head movement, distorted expression, and identity …

Conditional image-to-video generation with latent flow diffusion models

H Ni, C Shi, K Li, SX Huang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Conditional image-to-video (cI2V) generation aims to synthesize a new plausible video
starting from an image (e.g., a person's face) and a condition (e.g., an action class label like …

Motionctrl: A unified and flexible motion controller for video generation

Z Wang, Z Yuan, X Wang, Y Li, T Chen, M Xia… - ACM SIGGRAPH 2024 …, 2024 - dl.acm.org
Motions in a video primarily consist of camera motion, induced by camera movement, and
object motion, resulting from object movement. Accurate control of both camera and object …

Follow your pose: Pose-guided text-to-video generation using pose-free videos

Y Ma, Y He, X Cun, X Wang, S Chen, X Li… - Proceedings of the AAAI …, 2024 - ojs.aaai.org
Generating text-editable and pose-controllable character videos is in high demand for
creating various digital humans. Nevertheless, this task has been restricted by the absence …

Disco: Disentangled control for realistic human dance generation

T Wang, L Li, K Lin, Y Zhai, CC Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Generative AI has made significant strides in computer vision, particularly in text-driven
image/video synthesis (T2I/T2V). Despite the notable advancements, it remains challenging …

Flow-guided one-shot talking face generation with a high-resolution audio-visual dataset

Z Zhang, L Li, Y Ding, C Fan - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
One-shot talking face generation should synthesize high visual quality facial videos with
reasonable animations of expression and head pose, and just utilize arbitrary driving audio …