Champ: Controllable and consistent human image animation with 3d parametric guidance
In this study, we introduce a methodology for human image animation by leveraging a 3D
human parametric model within a latent diffusion framework to enhance shape alignment …
Animate anyone: Consistent and controllable image-to-video synthesis for character animation
L Hu - Proceedings of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
Character animation aims to generate character videos from still images through driving
signals. Currently diffusion models have become the mainstream in visual generation …
Grounded sam: Assembling open-world models for diverse visual tasks
We introduce Grounded SAM, which uses Grounding DINO as an open-set object detector to
combine with the segment anything model (SAM). This integration enables the detection and …
Mofa-video: Controllable image animation via generative motion field adaptions in frozen image-to-video diffusion model
We present MOFA-Video, an advanced controllable image animation method that generates
video from the given image using various additional controllable signals (such as human …
Physavatar: Learning the physics of dressed 3d avatars from visual observations
Modeling and rendering photorealistic avatars is of crucial importance in many applications.
Existing methods that build a 3D avatar from visual observations, however, struggle to …
Follow-your-emoji: Fine-controllable and expressive freestyle portrait animation
We present Follow-Your-Emoji, a diffusion-based framework for portrait animation, which
animates a reference portrait with target landmark sequences. The main challenge of portrait …
Sapiens: Foundation for human vision models
We present Sapiens, a family of models for four fundamental human-centric vision tasks–2D
pose estimation, body-part segmentation, depth estimation, and surface normal prediction …
Wear-any-way: Manipulable virtual try-on via sparse correspondence alignment
This paper introduces a novel framework for virtual try-on, termed Wear-Any-Way. Different
from previous methods, Wear-Any-Way is a customizable solution. Besides generating high …
Neural interactive keypoint detection
This work proposes an end-to-end neural interactive keypoint detection framework named
Click-Pose, which can significantly reduce more than 10 times labeling costs of 2D keypoint …
Mimicmotion: High-quality human motion video generation with confidence-aware pose guidance
Y Zhang, J Gu, LW Wang, H Wang, J Cheng… - arXiv preprint arXiv …, 2024 - arxiv.org
In recent years, generative artificial intelligence has achieved significant advancements in
the field of image generation, spawning a variety of applications. However, video generation …