Diffusion models in vision: A survey

FA Croitoru, V Hondru, RT Ionescu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Denoising diffusion models represent a recent emerging topic in computer vision,
demonstrating remarkable results in the area of generative modeling. A diffusion model is a …

Universal guidance for diffusion models

A Bansal, HM Chu, A Schwarzschild… - Proceedings of the …, 2023 - openaccess.thecvf.com
Typical diffusion models are trained to accept a particular form of conditioning, most
commonly text, and cannot be conditioned on other modalities without retraining. In this …

MultiDiffusion: Fusing diffusion paths for controlled image generation

O Bar-Tal, L Yariv, Y Lipman, T Dekel - 2023 - openreview.net
Recent advances in text-to-image generation with diffusion models present transformative
capabilities in image quality. However, user controllability of the generated image, and fast …

Collaborative diffusion for multi-modal face generation and editing

Z Huang, KCK Chan, Y Jiang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Diffusion models arise as a powerful generative tool recently. Despite the great progress,
existing diffusion models mainly focus on uni-modal control, i.e., the diffusion process is …

Multimodal image synthesis and editing: The generative AI era

F Zhan, Y Yu, R Wu, J Zhang, S Lu, L Liu… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
As information exists in various modalities in real world, effective interaction and fusion
among multimodal information plays a key role for the creation and perception of multimodal …

Multimodal image synthesis and editing: A survey

F Zhan, Y Yu, R Wu, J Zhang, S Lu, L Liu… - arXiv preprint arXiv …, 2022 - pure.mpg.de
As information exists in various modalities in real world, effective interaction and fusion
among multimodal information plays a key role for the creation and perception of multimodal …

FreeMask: Synthetic images with dense annotations make stronger segmentation models

L Yang, X Xu, B Kang, Y Shi… - Advances in Neural …, 2023 - proceedings.neurips.cc
Semantic segmentation has witnessed tremendous progress due to the proposal of various
advanced network architectures. However, they are extremely hungry for delicate …

Freestyle layout-to-image synthesis

H Xue, Z Huang, Q Sun, L Song… - Proceedings of the …, 2023 - openaccess.thecvf.com
Typical layout-to-image synthesis (LIS) models generate images for a closed set of semantic
classes, e.g., 182 common objects in COCO-Stuff. In this work, we explore the freestyle …

Zero-shot spatial layout conditioning for text-to-image diffusion models

G Couairon, M Careil, M Cord… - Proceedings of the …, 2023 - openaccess.thecvf.com
Large-scale text-to-image diffusion models have significantly improved the state of the art in
generative image modeling and allow for an intuitive and powerful user interface to drive the …

Generative semantic communication: Diffusion models beyond bit recovery

E Grassucci, S Barbarossa, D Comminiello - arXiv preprint arXiv …, 2023 - arxiv.org
Semantic communication is expected to be one of the cores of next-generation AI-based
communications. One of the possibilities offered by semantic communication is the capability …