Diffusion model-based image editing: A survey
Denoising diffusion models have emerged as a powerful tool for various image generation
and editing tasks, facilitating the synthesis of visual content in an unconditional or input …
The (r)evolution of multimodal large language models: A survey
Connecting text and visual modalities plays an essential role in generative intelligence. For
this reason, inspired by the success of large language models, significant research efforts …
IMPRINT: Generative object compositing by learning identity-preserving representation
Generative object compositing emerges as a promising new avenue for compositional
image editing. However, the requirement of object identity preservation poses a significant …
MP5: A multi-modal open-ended embodied system in Minecraft via active perception
It is a long-lasting goal to design an embodied system that can solve long-horizon open-
world tasks in human-like ways. However, existing approaches usually struggle with …
Efficient diffusion models: A comprehensive survey from principles to practices
Z Ma, Y Zhang, G Jia, L Zhao, Y Ma, M Ma… - arXiv preprint arXiv …, 2024 - arxiv.org
As one of the most popular and sought-after generative models in the recent years, diffusion
models have sparked the interests of many researchers and steadily shown excellent …
EditShield: Protecting Unauthorized Image Editing by Instruction-Guided Diffusion Models
Text-to-image diffusion models have emerged as an evolutionary for producing creative
content in image synthesis. Based on the impressive generation abilities of these models …
GenArtist: Multimodal LLM as an agent for unified image generation and editing
Despite the success achieved by existing image generation and editing methods, current
models still struggle with complex problems including intricate text prompts, and the …
Attentive Eraser: Unleashing Diffusion Model's Object Removal Potential via Self-Attention Redirection Guidance
W Sun, B Cui, J Tang, XM Dong - arXiv preprint arXiv:2412.12974, 2024 - arxiv.org
Recently, diffusion models have emerged as promising newcomers in the field of generative
models, shining brightly in image generation. However, when employed for object removal …
F-HOI: Toward Fine-grained Semantic-Aligned 3D Human-Object Interactions
Existing 3D human object interaction (HOI) datasets and models simply align global
descriptions with the long HOI sequence, while lacking a detailed understanding of …
UnifiedMLLM: Enabling unified representation for multi-modal multi-tasks with large language model
Significant advancements have recently been achieved in the field of multi-modal large
language models (MLLMs), demonstrating their remarkable capabilities in understanding …