State of the art on diffusion models for visual computing

R Po, W Yifan, V Golyanik, K Aberman… - Computer Graphics …, 2024 - Wiley Online Library
The field of visual computing is rapidly advancing due to the emergence of generative
artificial intelligence (AI), which unlocks unprecedented capabilities for the generation …

NExT-GPT: Any-to-any multimodal LLM

S Wu, H Fei, L Qu, W Ji, TS Chua - arXiv preprint arXiv:2309.05519, 2023 - arxiv.org
While recently Multimodal Large Language Models (MM-LLMs) have made exciting strides,
they mostly fall prey to the limitation of only input-side multimodal understanding, without the …

DynamiCrafter: Animating open-domain images with video diffusion priors

J Xing, M Xia, Y Zhang, H Chen, W Yu, H Liu… - … on Computer Vision, 2024 - Springer
Animating a still image offers an engaging visual experience. Traditional image animation
techniques mainly focus on animating natural scenes with stochastic dynamics (e.g., clouds …

SVDiff: Compact parameter space for diffusion fine-tuning

L Han, Y Li, H Zhang, P Milanfar… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recently, diffusion models have achieved remarkable success in text-to-image generation,
enabling the creation of high-quality images from text prompts and various conditions …

InstantBooth: Personalized text-to-image generation without test-time finetuning

J Shi, W Xiong, Z Lin, HJ Jung - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Recent advances in personalized image generation have enabled pre-trained text-to-image
models to learn new concepts from specific image sets. However, these methods often …

HyperDreamBooth: Hypernetworks for fast personalization of text-to-image models

N Ruiz, Y Li, V Jampani, W Wei, T Hou… - Proceedings of the …, 2024 - openaccess.thecvf.com
Personalization has emerged as a prominent aspect within the field of generative AI,
enabling the synthesis of individuals in diverse contexts and styles while retaining high …

PhotoMaker: Customizing realistic human photos via stacked ID embedding

Z Li, M Cao, X Wang, Z Qi… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent advances in text-to-image generation have made remarkable progress in
synthesizing realistic human photos conditioned on given text prompts. However, existing …

Break-a-scene: Extracting multiple concepts from a single image

O Avrahami, K Aberman, O Fried, D Cohen-Or… - SIGGRAPH Asia 2023 …, 2023 - dl.acm.org
Text-to-image model personalization aims to introduce a user-provided concept to the
model, allowing its synthesis in diverse contexts. However, current methods primarily focus …

Subject-diffusion: Open domain personalized text-to-image generation without test-time fine-tuning

J Ma, J Liang, C Chen, H Lu - ACM SIGGRAPH 2024 Conference …, 2024 - dl.acm.org
Recent progress in personalized image generation using diffusion models has been
significant. However, development in the area of open-domain and test-time fine-tuning-free …

Mix-of-show: Decentralized low-rank adaptation for multi-concept customization of diffusion models

Y Gu, X Wang, JZ Wu, Y Shi, Y Chen… - Advances in …, 2024 - proceedings.neurips.cc
Public large-scale text-to-image diffusion models, such as Stable Diffusion, have gained
significant attention from the community. These models can be easily customized for new …