Prompt-aligned gradient for prompt tuning
Thanks to the large pre-trained vision-language models (VLMs) like CLIP, we can craft a
zero-shot classifier by discrete prompt design, e.g., the confidence score of an image …
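To make the idea in this snippet concrete, below is a minimal sketch (not the paper's own prompt-tuning method) of the kind of zero-shot classifier it describes: class names are wrapped in a hand-written discrete prompt, and an image's per-class confidence score is its image-text similarity under CLIP. The checkpoint name, class names, and image path are illustrative assumptions.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Publicly released CLIP checkpoint; any CLIP variant would do.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

class_names = ["cat", "dog", "bird"]                    # hypothetical label set
prompts = [f"a photo of a {c}." for c in class_names]   # discrete prompt design

image = Image.open("example.jpg")                       # hypothetical input image
inputs = processor(text=prompts, images=image, return_tensors="pt", padding=True)

with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds the scaled image-text similarities; softmax turns
# them into per-class confidence scores for the zero-shot prediction.
probs = outputs.logits_per_image.softmax(dim=-1)
print(dict(zip(class_names, probs[0].tolist())))
```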
TCP: Textual-based class-aware prompt tuning for visual-language model
Prompt tuning represents a valuable technique for adapting pre-trained visual-language
models (VLM) to various downstream tasks. Recent advancements in CoOp-based methods …
Balancing act: distribution-guided debiasing in diffusion models
Abstract Diffusion Models (DMs) have emerged as powerful generative models with
unprecedented image generation capability. These models are widely used for data …
ArGue: Attribute-guided prompt tuning for vision-language models
Although soft prompt tuning is effective in efficiently adapting Vision-Language (V&L)
models for downstream tasks, it shows limitations in dealing with distribution shifts. We …
Generalized logit adjustment: Calibrating fine-tuned models by removing label bias in foundation models
Foundation models like CLIP allow zero-shot transfer on various tasks without additional
training data. Yet, the zero-shot performance is less competitive than a fully supervised one …
Improved visual fine-tuning with natural language supervision
Fine-tuning a visual pre-trained model can leverage the semantic information from large-
scale pre-training data and mitigate the over-fitting problem on downstream vision tasks with …
Fine-Tuning for Few-Shot Image Classification by Multimodal Prototype Regularization
Large pre-trained vision-language models, such as CLIP [Radford et al. 2021], have
demonstrated remarkable performance in few-shot image classification. To facilitate the …
Robust Fine-tuning of Zero-shot Models via Variance Reduction
When fine-tuning zero-shot models like CLIP, our desideratum is for the fine-tuned model to
excel in both in-distribution (ID) and out-of-distribution (OOD). Recently, ensemble-based …
Identifying implicit social biases in vision-language models
Vision-language models, like CLIP (Contrastive Language Image Pretraining), are
becoming increasingly popular for a wide range of multimodal retrieval tasks. However, prior …
Selective vision-language subspace projection for few-shot CLIP
Vision-language models such as CLIP are capable of mapping the different modality data
into a unified feature space, enabling zero/few-shot inference by measuring the similarity of …
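As a rough illustration of the shared feature space this snippet refers to (a sketch under assumed file names and class labels, not the paper's subspace-projection method), the example below embeds a few support images per class into CLIP's feature space, averages them into class prototypes, and classifies a query image by cosine similarity to those prototypes.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def embed(paths):
    """Encode images into L2-normalized CLIP image features."""
    images = [Image.open(p) for p in paths]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        feats = model.get_image_features(**inputs)
    return feats / feats.norm(dim=-1, keepdim=True)

# Hypothetical few-shot support set: a couple of labeled images per class.
support = {
    "cat": ["cat_1.jpg", "cat_2.jpg"],
    "dog": ["dog_1.jpg", "dog_2.jpg"],
}

# Class prototypes: mean of each class's normalized features, re-normalized.
prototypes = torch.stack([embed(paths).mean(dim=0) for paths in support.values()])
prototypes = prototypes / prototypes.norm(dim=-1, keepdim=True)

# Because images and texts share one feature space, inference reduces to
# cosine similarity; here the query is matched against image prototypes.
query = embed(["query.jpg"])             # hypothetical query image, shape (1, d)
scores = query @ prototypes.T            # similarity to each class prototype
pred = list(support)[scores.argmax(dim=-1).item()]
print(pred)
```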