Dataset diffusion: Diffusion-based synthetic data generation for pixel-level semantic segmentation
Preparing training data for deep vision models is a labor-intensive task. To address this,
generative models have emerged as an effective solution for generating synthetic data …
Understanding the latent space of diffusion models through the lens of Riemannian geometry
Despite the success of diffusion models (DMs), we still lack a thorough understanding of
their latent space. To understand the latent space $\mathbf{x}_t \in \mathcal{X}$, we …
Uncovering prototypical knowledge for weakly open-vocabulary semantic segmentation
This paper studies the problem of weakly open-vocabulary semantic segmentation
(WOVSS), which learns to segment objects of arbitrary classes using mere image-text pairs …
Diffusion models for zero-shot open-vocabulary segmentation
The variety of objects in the real world is nearly unlimited and is thus impossible to capture
using models trained on a fixed set of categories. As a result, in recent years, open …
Lexicon3D: Probing visual foundation models for complex 3D scene understanding
Complex 3D scene understanding has gained increasing attention, with scene encoding
strategies built on top of visual foundation models playing a crucial role in this success …
TokenCompose: Text-to-image diffusion with token-level supervision
We present TokenCompose, a Latent Diffusion Model for text-to-image generation
that achieves enhanced consistency between user-specified text prompts and model …
AttrSeg: Open-vocabulary semantic segmentation via attribute decomposition-aggregation
Open-vocabulary semantic segmentation is a challenging task that requires segmenting
novel object categories at inference time. Recent works explore vision-language pre-training …
Distilling vision-language pre-training to collaborate with weakly-supervised temporal action localization
Weakly-supervised temporal action localization (WTAL) learns to detect and classify action
instances with only category labels. Most methods widely adopt the off-the-shelf …
Turbo: Informativity-driven acceleration plug-in for vision-language large models
Vision-Language Large Models (VLMs) have recently become the primary backbone of AI,
due to their impressive performance. However, their expensive computation costs, i.e., …
UniGS: Unified representation for image generation and segmentation
This paper introduces a novel unified representation of diffusion models for image
generation and segmentation. Specifically, we use a colormap to represent entity-level …