CLIP in medical imaging: A comprehensive survey
Contrastive Language-Image Pre-training (CLIP), a simple yet effective pre-training
paradigm, successfully introduces text supervision to vision models. It has shown promising …
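For context, the text supervision CLIP introduces comes from a symmetric contrastive objective over matched image-text pairs. Below is a minimal sketch of that loss in PyTorch; the pre-computed embeddings and the temperature value are illustrative assumptions, not the survey's or CLIP's exact training setup.

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss used by CLIP-style pre-training.

    image_emb, text_emb: (N, D) embeddings for N matched image-text pairs.
    The i-th image and i-th text form the positive pair; all other
    in-batch pairings act as negatives.
    """
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature            # (N, N) similarities
    targets = torch.arange(len(logits), device=logits.device)  # diagonal = matches
    loss_i2t = F.cross_entropy(logits, targets)                # image -> text
    loss_t2i = F.cross_entropy(logits.t(), targets)            # text -> image
    return (loss_i2t + loss_t2i) / 2
```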
Visual tuning
Fine-tuning visual models has been widely shown to deliver promising performance on many downstream visual tasks. With the rapid development of pre-trained visual foundation …
A systematic survey of prompt engineering on vision-language foundation models
Prompt engineering is a technique that involves augmenting a large pre-trained model with
task-specific hints, known as prompts, to adapt the model to new tasks. Prompts can be …
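As a concrete illustration of prompting a vision-language foundation model, the sketch below runs zero-shot classification with OpenAI's clip package by wrapping class names in a handcrafted template; the image path and class list are hypothetical placeholders.

```python
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Hypothetical label set; the template string is the "prompt" being engineered.
classes = ["cat", "dog", "horse"]
text = clip.tokenize([f"a photo of a {c}" for c in classes]).to(device)
image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(device)  # placeholder path

with torch.no_grad():
    logits_per_image, _ = model(image, text)  # image-text similarity logits
    probs = logits_per_image.softmax(dim=-1)

print(dict(zip(classes, probs[0].tolist())))
```

Swapping the template (e.g. "a sketch of a {c}") changes accuracy without touching model weights, which is the point of prompt engineering.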
A pilot study of query-free adversarial attack against Stable Diffusion
Despite the record-breaking performance of Stable Diffusion in Text-to-Image (T2I) generation, little research attention has been paid to its adversarial robustness. In this work, we study …
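The "query-free" idea is that Stable Diffusion conditions on a CLIP text encoder, so a perturbation can be crafted against the text embedding alone, without ever querying the diffusion model. The sketch below illustrates this with a crude random search for a short adversarial suffix that shifts the prompt's CLIP embedding; the paper's actual optimization is more sophisticated, and the prompt, suffix length, and search budget here are assumptions.

```python
import random
import string
import torch
import torch.nn.functional as F
import clip

device = "cpu"
model, _ = clip.load("ViT-B/32", device=device)

@torch.no_grad()
def embed(text):
    tokens = clip.tokenize([text]).to(device)
    return F.normalize(model.encode_text(tokens).float(), dim=-1)

prompt = "a photo of a red sports car"  # hypothetical target prompt
base = embed(prompt)

best_suffix, best_dist = "", 0.0
for _ in range(200):  # random search; never queries the diffusion model itself
    suffix = "".join(random.choices(string.ascii_lowercase, k=5))
    dist = 1.0 - (embed(prompt + " " + suffix) @ base.T).item()
    if dist > best_dist:
        best_suffix, best_dist = suffix, dist

print(best_suffix, best_dist)  # suffix that most distorts the text embedding
```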
One prompt word is enough to boost adversarial robustness for pre-trained vision-language models
Large pre-trained Vision-Language Models (VLMs) like CLIP, despite having remarkable generalization ability, are highly vulnerable to adversarial examples. This work …
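The vulnerability claim can be reproduced with a standard PGD attack on CLIP's zero-shot logits. The sketch below is that generic attack, not this paper's defense; the attack budget is an assumption, and for simplicity it perturbs the already-preprocessed (normalized) input tensor rather than raw pixels.

```python
import torch
import torch.nn.functional as F
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def pgd_attack(image, text_tokens, label, eps=4/255, alpha=1/255, steps=10):
    """L-infinity PGD against CLIP zero-shot classification.

    image: (1, 3, H, W) preprocessed tensor; text_tokens: tokenized class
    prompts; label: (1,) tensor with the true class index.
    Note: eps is applied in the normalized input space for brevity.
    """
    delta = torch.zeros_like(image, requires_grad=True)
    for _ in range(steps):
        logits, _ = model(image + delta, text_tokens)
        loss = F.cross_entropy(logits, label)  # attacker maximizes this loss
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    return (image + delta).detach()
```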
Robust CLIP: Unsupervised adversarial fine-tuning of vision embeddings for robust large vision-language models
Multi-modal foundation models like OpenFlamingo, LLaVA, and GPT-4 are increasingly
used for various real-world tasks. Prior work has shown that these models are highly …
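The core recipe here is unsupervised: fine-tune the vision encoder so that embeddings of adversarially perturbed images stay close to the frozen original encoder's clean embeddings, preserving compatibility with downstream VLMs that consume those embeddings. A minimal sketch of one such training step follows; the PGD budget, MSE loss choice, and function names are assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def unsupervised_adv_step(encoder, frozen_encoder, images,
                          eps=4/255, alpha=1/255, steps=10):
    """One adversarial fine-tuning step: keep the tuned encoder's adversarial
    embeddings close to the frozen original encoder's clean embeddings.
    Assumes images lie in [0, 1]; hyperparameters are illustrative."""
    with torch.no_grad():
        target = frozen_encoder(images)  # clean embeddings, fixed targets
    delta = torch.empty_like(images).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):  # inner PGD: maximize embedding drift
        adv = (images + delta).clamp(0, 1)
        loss = F.mse_loss(encoder(adv), target)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps)
        delta = delta.detach().requires_grad_(True)
    adv = (images + delta.detach()).clamp(0, 1)
    return F.mse_loss(encoder(adv), target)  # outer loss: minimized w.r.t. encoder
```

In a full training loop, the returned loss would be backpropagated into `encoder` by an optimizer while `frozen_encoder` stays untouched.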
Towards calibrated robust fine-tuning of vision-language models
Improving out-of-distribution (OOD) generalization during in-distribution (ID) adaptation is a
primary goal of robust fine-tuning of zero-shot models beyond naive fine-tuning. However …
Not all prompts are secure: A switchable backdoor attack against pre-trained vision transformers
Given the power of vision transformers, a new learning paradigm, pre-training and then prompting, makes it more efficient and effective to address downstream visual recognition …
ImageNet-D: Benchmarking neural network robustness on diffusion synthetic object
We establish rigorous benchmarks for visual perception robustness. Synthetic images such as ImageNet-C, ImageNet-9, and Stylized-ImageNet provide a specific type of evaluation over …
Pre-trained model guided fine-tuning for zero-shot adversarial robustness
Large-scale pre-trained vision-language models like CLIP have demonstrated impressive
performance across various tasks and exhibit remarkable zero-shot generalization capability …