Making images real again: A comprehensive survey on deep image composition

L Niu, W Cong, L Liu, Y Hong, B Zhang, J Liang… - arxiv preprint arxiv …, 2021 - arxiv.org
As a common image editing operation, image composition aims to combine the foreground
from one image and another background image, resulting in a composite image. However …

Uni-controlnet: All-in-one control to text-to-image diffusion models

S Zhao, D Chen, YC Chen, J Bao… - Advances in …, 2023 - proceedings.neurips.cc
Text-to-Image diffusion models have made tremendous progress over the past two years,
enabling the generation of highly realistic images based on open-domain text descriptions …

Videocomposer: Compositional video synthesis with motion controllability

X Wang, H Yuan, S Zhang, D Chen… - Advances in …, 2023 - proceedings.neurips.cc
The pursuit of controllability as a higher standard of visual content creation has yielded
remarkable progress in customizable image synthesis. However, achieving controllable …

A survey of multimodal-guided image editing with text-to-image diffusion models

X Shuai, H Ding, X Ma, R Tu, YG Jiang… - arxiv preprint arxiv …, 2024 - arxiv.org
Image editing aims to edit the given synthetic or real image to meet the specific requirements
from users. It is widely studied in recent years as a promising and challenging field of …

Photomaker: Customizing realistic human photos via stacked id embedding

Z Li, M Cao, X Wang, Z Qi… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent advances in text-to-image generation have made remarkable progress in
synthesizing realistic human photos conditioned on given text prompts. However existing …

Raphael: Text-to-image generation via large mixture of diffusion paths

Z Xue, G Song, Q Guo, B Liu, Z Zong… - Advances in Neural …, 2023 - proceedings.neurips.cc
Text-to-image generation has recently witnessed remarkable achievements. We introduce a
text-conditional image diffusion model, termed RAPHAEL, to generate highly artistic images …

Dreamvideo: Composing your dream videos with customized subject and motion

Y Wei, S Zhang, Z Qing, H Yuan, Z Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Customized generation using diffusion models has made impressive progress in image
generation but remains unsatisfactory in the challenging video generation task as it requires …

Motiondirector: Motion customization of text-to-video diffusion models

R Zhao, Y Gu, JZ Wu, DJ Zhang, JW Liu, W Wu… - … on Computer Vision, 2024 - Springer
Large-scale pre-trained diffusion models have exhibited remarkable capabilities in diverse
video generations. Given a set of video clips of the same motion concept, the task of Motion …

Videobooth: Diffusion-based video generation with image prompts

Y Jiang, T Wu, S Yang, C Si, D Lin… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-driven video generation witnesses rapid progress. However merely using text prompts
is not enough to depict the desired subject appearance that accurately aligns with users' …

Momentdiff: Generative video moment retrieval from random to real

P Li, CW **e, H **e, L Zhao, L Zhang… - Advances in neural …, 2023 - proceedings.neurips.cc
Video moment retrieval pursues an efficient and generalized solution to identify the specific
temporal segments within an untrimmed video that correspond to a given language …