Open-vocabulary panoptic segmentation with text-to-image diffusion models

J Xu, S Liu, A Vahdat, W Byeon… - Proceedings of the …, 2023 - openaccess.thecvf.com
We present ODISE: Open-vocabulary DIffusion-based panoptic SEgmentation, which unifies
pre-trained text-image diffusion and discriminative models to perform open-vocabulary …

InstructPix2Pix: Learning to follow image editing instructions

T Brooks, A Holynski, AA Efros - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We propose a method for editing images from human instructions: given an input image and
a written instruction that tells the model what to do, our model follows these instructions to …

Synthetic data from diffusion models improves ImageNet classification

S Azizi, S Kornblith, C Saharia, M Norouzi… - arXiv preprint arXiv …, 2023 - arxiv.org
Deep generative models are becoming increasingly powerful, now generating diverse high
fidelity photo-realistic samples given text prompts. Have they reached the point where …

DiffuMask: Synthesizing images with pixel-level annotations for semantic segmentation using diffusion models

W Wu, Y Zhao, MZ Shou, H Zhou… - Proceedings of the …, 2023 - openaccess.thecvf.com
Collecting and annotating images with pixel-wise labels is time-consuming and laborious. In
contrast, synthetic data can be freely available using a generative model (e.g., DALL-E …

Fake it till you make it: Learning transferable representations from synthetic ImageNet clones

MB Sarıyıldız, K Alahari, D Larlus… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recent image generation models such as Stable Diffusion have exhibited an impressive
ability to generate fairly realistic images starting from a simple text prompt. Could such …

DatasetDM: Synthesizing data with perception annotations using diffusion models

W Wu, Y Zhao, H Chen, Y Gu, R Zhao… - Advances in …, 2023 - proceedings.neurips.cc
Current deep networks are very data-hungry and benefit from training on large-scale
datasets, which are often time-consuming to collect and annotate. By contrast, synthetic data …

Dataset diffusion: Diffusion-based synthetic data generation for pixel-level semantic segmentation

Q Nguyen, T Vu, A Tran… - Advances in Neural …, 2024 - proceedings.neurips.cc
Preparing training data for deep vision models is a labor-intensive task. To address this,
generative models have emerged as an effective solution for generating synthetic data …

FreeMask: Synthetic images with dense annotations make stronger segmentation models

L Yang, X Xu, B Kang, Y Shi… - Advances in Neural …, 2024 - proceedings.neurips.cc
Semantic segmentation has witnessed tremendous progress due to the proposal of various
advanced network architectures. However, they are extremely hungry for delicate …

Convolution-Transformer for Image Feature Extraction.

L Yin, L Wang, S Lu, R Wang, Y Yang… - … in Engineering & …, 2024 - researchgate.net
This study addresses the limitations of Transformer models in image feature extraction,
particularly their lack of inductive bias for visual structures. Compared to Convolutional …

Open-vocabulary object segmentation with diffusion models

Z Li, Q Zhou, X Zhang, Y Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com
The goal of this paper is to extract the visual-language correspondence from a pre-trained
text-to-image diffusion model, in the form of segmentation map, i.e., simultaneously …