RB-Modulation: Training-free personalization of diffusion models using stochastic optimal control

L Rout, Y Chen, N Ruiz, A Kumar, C Caramanis… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose Reference-Based Modulation (RB-Modulation), a new plug-and-play solution
for training-free personalization of diffusion models. Existing training-free approaches exhibit …
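
The snippet names stochastic optimal control as the core mechanism. A minimal sketch of that general idea, not the authors' code: a reverse-diffusion step steered by the gradient of a terminal style cost on reference features, with toy_denoiser and toy_style_features standing in (as assumptions) for a pretrained noise predictor and a style descriptor.

```python
import torch

def toy_denoiser(x_t, t):
    # Placeholder for a pretrained epsilon-prediction diffusion network.
    return 0.1 * x_t

def toy_style_features(x):
    # Placeholder for a style descriptor extracted from an image.
    return x.mean(dim=(2, 3))

def control_correction(x_t, t, alpha_bar_t, ref_feats, step_size=0.5):
    # Terminal-cost guidance: estimate the clean image, measure its
    # feature distance to the reference, and steer the noisy sample
    # against that cost's gradient. A full sampler would combine this
    # with the usual DDIM/ancestral update.
    x_t = x_t.detach().requires_grad_(True)
    eps = toy_denoiser(x_t, t)
    x0_hat = (x_t - (1 - alpha_bar_t).sqrt() * eps) / alpha_bar_t.sqrt()
    cost = (toy_style_features(x0_hat) - ref_feats).pow(2).sum()
    grad = torch.autograd.grad(cost, x_t)[0]
    return (x_t - step_size * grad).detach()

x = torch.randn(1, 3, 64, 64)
ref = toy_style_features(torch.randn(1, 3, 64, 64))
x = control_correction(x, t=500, alpha_bar_t=torch.tensor(0.5), ref_feats=ref)
```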

InstantStyle-Plus: Style transfer with content-preserving in text-to-image generation

H Wang, P Xing, R Huang, H Ai, Q Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Style transfer is an inventive process designed to create an image that maintains the
essence of the original while embracing the visual style of another. Although diffusion …

StyleTex: Style Image-Guided Texture Generation for 3D Models

Z Xie, Y Zhang, X Tang, Y Wu, D Chen, G Li… - ACM Transactions on …, 2024 - dl.acm.org
Style-guided texture generation aims to generate a texture that is harmonious with both the
style of the reference image and the geometry of the input mesh, given a reference style …

From parts to whole: A unified reference framework for controllable human image generation

Z Huang, H Fan, L Wang, L Sheng - arXiv preprint arXiv:2404.15267, 2024 - arxiv.org
Recent advancements in controllable human image generation have led to zero-shot
generation using structural signals (e.g., pose, depth) or facial appearance. Yet, generating …

MagicTailor: Component-Controllable Personalization in Text-to-Image Diffusion Models

D Zhou, J Huang, J Bai, J Wang, H Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in text-to-image (T2I) diffusion models have enabled the creation of
high-quality images from text prompts, but they still struggle to generate images with precise …

Dream-in-Style: Text-to-3D Generation Using Stylized Score Distillation

H Kompanowski, BS Hua - arXiv preprint arXiv:2406.18581, 2024 - arxiv.org
We present a method to generate 3D objects in a given style. Our method takes a text prompt and a
style reference image as input and reconstructs a neural radiance field to synthesize a 3D …
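
"Stylized score distillation" presumably builds on standard score distillation sampling (SDS), where a frozen diffusion model's denoising error is injected as a gradient on renderings of the 3D representation. Below is a hedged plain-SDS sketch; the style conditioning is folded into the placeholder toy_eps_model, an assumption rather than the paper's interface.

```python
import torch

def toy_eps_model(x_t, t):
    # Placeholder for a frozen diffusion model conditioned on the
    # text prompt and style reference.
    return 0.1 * x_t

def sds_grad(rendered, t, alpha_bar_t):
    # Noise the rendering, ask the frozen model to predict the noise,
    # and use the prediction error as a gradient on the rendering.
    eps = torch.randn_like(rendered)
    x_t = alpha_bar_t.sqrt() * rendered + (1 - alpha_bar_t).sqrt() * eps
    with torch.no_grad():
        eps_pred = toy_eps_model(x_t, t)
    w = 1 - alpha_bar_t  # a common timestep weighting
    return w * (eps_pred - eps)

render = torch.randn(1, 3, 64, 64, requires_grad=True)  # stands in for a NeRF rendering
grad = sds_grad(render, t=300, alpha_bar_t=torch.tensor(0.7))
render.backward(gradient=grad)  # inject the SDS gradient directly
```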

AttenCraft: Attention-guided Disentanglement of Multiple Concepts for Text-to-Image Customization

J Shentu, M Watson, NA Moubayed - arXiv preprint arXiv:2405.17965, 2024 - arxiv.org
With the unprecedented performance being achieved by text-to-image (T2I) diffusion
models, T2I customization further empowers users to tailor the diffusion model to new …
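
Attention-guided disentanglement of this kind typically reads cross-attention maps to decide which image regions belong to which concept token, then masks each concept's supervision accordingly. A self-contained sketch with random stand-in features; cross_attention_maps and concept_masks are illustrative names, not the paper's API.

```python
import torch

def cross_attention_maps(image_feats, token_embeds):
    # image_feats: (hw, d) spatial features; token_embeds: (n_tokens, d).
    # Softmax over tokens gives, per location, how strongly each
    # concept token is attended.
    logits = image_feats @ token_embeds.T / token_embeds.shape[-1] ** 0.5
    return logits.softmax(dim=-1)  # (hw, n_tokens)

def concept_masks(attn, threshold=0.5):
    # Normalize each token's map to [0, 1] and threshold it, yielding
    # one binary spatial mask per concept.
    maps = attn / attn.amax(dim=0, keepdim=True)
    return (maps > threshold).float()

feats = torch.randn(16 * 16, 64)
tokens = torch.randn(2, 64)  # e.g., two concept tokens being learned
masks = concept_masks(cross_attention_maps(feats, tokens))  # (256, 2)
```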

FAM Diffusion: Frequency and Attention Modulation for High-Resolution Image Generation with Stable Diffusion

H Yang, A Bulat, I Hadji, HX Pham, X Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models are proficient at generating high-quality images. They are, however, effective
only when operating at the resolution used during training. Inference at a scaled …
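
One plausible reading of the frequency-modulation idea, offered here as an assumption rather than the paper's exact method: keep the global structure (low frequencies) of an upsampled native-resolution result while taking fine detail (high frequencies) from the high-resolution denoising trajectory, via an FFT low-pass mask.

```python
import torch

def frequency_modulate(high_res, upsampled_ref, cutoff=0.25):
    # Blend low frequencies of the reference with high frequencies
    # of the current high-resolution sample.
    fft_hr = torch.fft.fftshift(torch.fft.fft2(high_res))
    fft_ref = torch.fft.fftshift(torch.fft.fft2(upsampled_ref))
    _, _, h, w = high_res.shape
    yy, xx = torch.meshgrid(
        torch.linspace(-1, 1, h), torch.linspace(-1, 1, w), indexing="ij"
    )
    low_pass = ((xx ** 2 + yy ** 2).sqrt() < cutoff).to(fft_hr.dtype)
    blended = fft_ref * low_pass + fft_hr * (1 - low_pass)
    return torch.fft.ifft2(torch.fft.ifftshift(blended)).real

hr = torch.randn(1, 4, 128, 128)   # current high-resolution latent
ref = torch.randn(1, 4, 128, 128)  # low-resolution result, upsampled to match
out = frequency_modulate(hr, ref)
```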

Bringing Characters to New Stories: Training-Free Theme-Specific Image Generation via Dynamic Visual Prompting

Y Zhang, M Luo, W Dong, X Yang, H Huang… - arXiv preprint arXiv …, 2025 - arxiv.org
The stories and characters that captivate us as we grow up shape unique fantasy worlds,
with images serving as the primary medium for visually experiencing these realms …

ReEdit: Multimodal Exemplar-Based Image Editing with Diffusion Models

A Srivastava, TR Menta, A Java, A Jadhav… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern Text-to-Image (T2I) diffusion models have revolutionized image editing by enabling
the generation of high-quality photorealistic images. While the de facto method for …