Champ: Controllable and consistent human image animation with 3D parametric guidance

S Zhu, JL Chen, Z Dai, Z Dong, Y Xu, X Cao… - … on Computer Vision, 2024 - Springer
In this study, we introduce a methodology for human image animation by leveraging a 3D
human parametric model within a latent diffusion framework to enhance shape alignment …

Animate anyone: Consistent and controllable image-to-video synthesis for character animation

L Hu - Proceedings of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Character Animation aims to generate character videos from still images through driving
signals. Currently, diffusion models have become the mainstream in visual generation …

Diffusion action segmentation

D Liu, Q Li, AD Dinh, T Jiang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Temporal action segmentation is crucial for understanding long-form videos. Previous works
on this task commonly adopt an iterative refinement paradigm by using multi-stage models …

Prompt-free diffusion: Taking "text" out of text-to-image diffusion models

X Xu, J Guo, Z Wang, G Huang… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-to-image (T2I) research has grown explosively in the past year owing to the
large-scale pre-trained diffusion models and many emerging personalization and editing …

Ladi-vton: Latent diffusion textual-inversion enhanced virtual try-on

D Morelli, A Baldrati, G Cartella, M Cornia… - Proceedings of the 31st …, 2023 - dl.acm.org
The rapidly evolving fields of e-commerce and metaverse continue to seek innovative
approaches to enhance the consumer experience. At the same time, recent advancements …

Wear-any-way: Manipulable virtual try-on via sparse correspondence alignment

M Chen, X Chen, Z Zhai, C Ju, X Hong, J Lan… - … on Computer Vision, 2024 - Springer
This paper introduces a novel framework for virtual try-on, termed Wear-Any-Way. Different
from previous methods, Wear-Any-Way is a customizable solution. Besides generating high …

360-degree Human Video Generation with 4D Diffusion Transformer

R Shao, Y Pang, Z Zheng, J Sun, Y Liu - ACM Transactions on Graphics …, 2024 - dl.acm.org
We present a novel approach for generating 360-degree high-quality, spatiotemporally
coherent human videos from a single image. Our framework combines the strengths of …

Appearance and Pose-guided Human Generation: A Survey

F Liao, X Zou, W Wong - ACM Computing Surveys, 2024 - dl.acm.org
Appearance and pose-guided human generation is a burgeoning field that has captured
significant attention. This subject's primary objective is to transfer pose information from a …

Controllable person image synthesis with pose-constrained latent diffusion

X Han, X Zhu, J Deng, YZ Song… - Proceedings of the …, 2023 - openaccess.thecvf.com
Controllable person image synthesis aims at rendering a source image based on user-
specified changes in body pose or appearance. Prior art approaches leverage pixel-level …

Reuse and diffuse: Iterative denoising for text-to-video generation

J Gu, S Wang, H Zhao, T Lu, X Zhang, Z Wu… - arxiv preprint arxiv …, 2023 - arxiv.org
Inspired by the remarkable success of Latent Diffusion Models (LDMs) for image synthesis,
we study LDM for text-to-video generation, which is a formidable challenge due to the …