Turnitin
降AI改写
早检测系统
早降重系统
Turnitin-UK版
万方检测-期刊版
维普编辑部版
Grammarly检测
Paperpass检测
checkpass检测
PaperYY检测
Parameter-efficient fine-tuning for large models: A comprehensive survey
Large models represent a groundbreaking advancement in multiple application fields,
enabling remarkable achievements across various tasks. However, their unprecedented …
enabling remarkable achievements across various tasks. However, their unprecedented …
Vision-language pre-training: Basics, recent advances, and future trends
This monograph surveys vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years. We group these approaches …
intelligence that have been developed in the last few years. We group these approaches …
Grounding dino: Marrying dino with grounded pre-training for open-set object detection
In this paper, we develop an open-set object detector, called Grounding DINO, by marrying
Transformer-based detector DINO with grounded pre-training, which can detect arbitrary …
Transformer-based detector DINO with grounded pre-training, which can detect arbitrary …
Llama-adapter: Efficient fine-tuning of language models with zero-init attention
We present LLaMA-Adapter, a lightweight adaption method to efficiently fine-tune LLaMA
into an instruction-following model. Using 52K self-instruct demonstrations, LLaMA-Adapter …
into an instruction-following model. Using 52K self-instruct demonstrations, LLaMA-Adapter …
Repurposing diffusion-based image generators for monocular depth estimation
Monocular depth estimation is a fundamental computer vision task. Recovering 3D depth
from a single image is geometrically ill-posed and requires scene understanding so it is not …
from a single image is geometrically ill-posed and requires scene understanding so it is not …
Vision-language models for vision tasks: A survey
Most visual recognition studies rely heavily on crowd-labelled data in deep neural networks
(DNNs) training, and they usually train a DNN for each single visual recognition task …
(DNNs) training, and they usually train a DNN for each single visual recognition task …
Adding conditional control to text-to-image diffusion models
We present ControlNet, a neural network architecture to add spatial conditioning controls to
large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large …
large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large …
A visual-language foundation model for computational pathology
The accelerated adoption of digital pathology and advances in deep learning have enabled
the development of robust models for various pathology tasks across a diverse array of …
the development of robust models for various pathology tasks across a diverse array of …
Multi-concept customization of text-to-image diffusion
While generative models produce high-quality images of concepts learned from a large-
scale database, a user often wishes to synthesize instantiations of their own concepts (for …
scale database, a user often wishes to synthesize instantiations of their own concepts (for …
Maple: Multi-modal prompt learning
Pre-trained vision-language (VL) models such as CLIP have shown excellent generalization
ability to downstream tasks. However, they are sensitive to the choice of input text prompts …
ability to downstream tasks. However, they are sensitive to the choice of input text prompts …