State of the art on diffusion models for visual computing

R Po, W Yifan, V Golyanik, K Aberman… - Computer Graphics …, 2024 - Wiley Online Library
The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …

Align your gaussians: Text-to-4d with dynamic 3d gaussians and composed diffusion models

H Ling, SW Kim, A Torralba… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-guided diffusion models have revolutionized image and video generation and have
also been successfully used for optimization-based 3D object synthesis. Here we instead …

Nifty: Neural object interaction fields for guided human motion synthesis

N Kulkarni, D Rempe, K Genova… - Proceedings of the …, 2024 - openaccess.thecvf.com
We address the problem of generating realistic 3D motions of humans interacting with
objects in a scene. Our key idea is to create a neural interaction field attached to a specific …

A systematic survey of prompt engineering on vision-language foundation models

J Gu, Z Han, S Chen, A Beirami, B He, G Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Prompt engineering is a technique that involves augmenting a large pre-trained model with
task-specific hints, known as prompts, to adapt the model to new tasks. Prompts can be …

Controllable human-object interaction synthesis

J Li, A Clegg, R Mottaghi, J Wu, X Puig… - European Conference on …, 2024 - Springer
Synthesizing semantic-aware, long-horizon human-object interactions is critical for simulating
realistic human behaviors. In this work, we address the challenging problem of generating …

Cg-hoi: Contact-guided 3d human-object interaction generation

C Diller, A Dai - Proceedings of the IEEE/CVF Conference …, 2024 - openaccess.thecvf.com
We propose CG-HOI, the first method to address the task of generating dynamic 3D human-
object interactions (HOIs) from text. We model the motion of both human and object in an …

Generating human interaction motions in scenes with text control

H Yi, J Thies, MJ Black, XB Peng, D Rempe - European Conference on …, 2024 - Springer
We present TeSMo, a text-controlled scene-aware motion generation method based on
denoising diffusion models. Previous text-to-motion methods focus on characters in isolation …

4dgen: Grounded 4d content generation with spatial-temporal consistency

Y Yin, D Xu, Z Wang, Y Zhao, Y Wei - arXiv preprint arXiv:2312.17225, 2023 - arxiv.org
Aided by text-to-image and text-to-video diffusion models, existing 4D content creation
pipelines utilize score distillation sampling to optimize the entire dynamic 3D scene …

Omnicontrol: Control any joint at any time for human motion generation

Y **e, V Jampani, L Zhong, D Sun, H Jiang - arxiv preprint arxiv …, 2023 - arxiv.org
We present a novel approach named OmniControl for incorporating flexible spatial control
signals into a text-conditioned human motion generation model based on the diffusion …

TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos

Y Wang, Z Wang, L Liu, K Daniilidis - European Conference on Computer …, 2024 - Springer
We propose TRAM, a two-stage method to reconstruct a human's global trajectory and
motion from in-the-wild videos. TRAM robustifies SLAM to recover the camera motion in the …