Open-vocabulary panoptic segmentation with text-to-image diffusion models
We present ODISE: Open-vocabulary DIffusion-based panoptic SEgmentation, which unifies
pre-trained text-image diffusion and discriminative models to perform open-vocabulary …
InstructPix2Pix: Learning to follow image editing instructions
We propose a method for editing images from human instructions: given an input image and
a written instruction that tells the model what to do, our model follows these instructions to …
Synthetic data from diffusion models improves ImageNet classification
Deep generative models are becoming increasingly powerful, now generating diverse high
fidelity photo-realistic samples given text prompts. Have they reached the point where …
DiffuMask: Synthesizing images with pixel-level annotations for semantic segmentation using diffusion models
Collecting and annotating images with pixel-wise labels is time-consuming and laborious. In
contrast, synthetic data can be freely available using a generative model (e.g., DALL-E …
Fake it till you make it: Learning transferable representations from synthetic ImageNet clones
Recent image generation models such as Stable Diffusion have exhibited an impressive
ability to generate fairly realistic images starting from a simple text prompt. Could such …
DatasetDM: Synthesizing data with perception annotations using diffusion models
Current deep networks are very data-hungry and benefit from training on large-scale
datasets, which are often time-consuming to collect and annotate. By contrast, synthetic data …
Dataset diffusion: Diffusion-based synthetic data generation for pixel-level semantic segmentation
Preparing training data for deep vision models is a labor-intensive task. To address this,
generative models have emerged as an effective solution for generating synthetic data …
FreeMask: Synthetic images with dense annotations make stronger segmentation models
Semantic segmentation has witnessed tremendous progress due to the proposal of various
advanced network architectures. However, they are extremely hungry for delicate …
Convolution-Transformer for Image Feature Extraction
This study addresses the limitations of Transformer models in image feature extraction,
particularly their lack of inductive bias for visual structures. Compared to Convolutional …
Open-vocabulary object segmentation with diffusion models
The goal of this paper is to extract the visual-language correspondence from a pre-trained
text-to-image diffusion model, in the form of a segmentation map, i.e., simultaneously …