Turnitin
降AI改写
早检测系统
早降重系统
Turnitin-UK版
万方检测-期刊版
维普编辑部版
Grammarly检测
Paperpass检测
checkpass检测
PaperYY检测
Instruction tuning for large language models: A survey
This paper surveys research works in the quickly advancing field of instruction tuning (IT),
which can also be referred to as supervised fine-tuning (SFT)\footnote {In this paper, unless …
which can also be referred to as supervised fine-tuning (SFT)\footnote {In this paper, unless …
A comprehensive survey on pretrained foundation models: A history from bert to chatgpt
Abstract Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (eg, BERT, ChatGPT, GPT-4) is …
downstream tasks across different data modalities. A PFM (eg, BERT, ChatGPT, GPT-4) is …
Segment anything
Abstract We introduce the Segment Anything (SA) project: a new task, model, and dataset for
image segmentation. Using our efficient model in a data collection loop, we built the largest …
image segmentation. Using our efficient model in a data collection loop, we built the largest …
Dinov2: Learning robust visual features without supervision
The recent breakthroughs in natural language processing for model pretraining on large
quantities of data have opened the way for similar foundation models in computer vision …
quantities of data have opened the way for similar foundation models in computer vision …
Depth anything: Unleashing the power of large-scale unlabeled data
Abstract This work presents Depth Anything a highly practical solution for robust monocular
depth estimation. Without pursuing novel technical modules we aim to build a simple yet …
depth estimation. Without pursuing novel technical modules we aim to build a simple yet …
[PDF][PDF] The dawn of lmms: Preliminary explorations with gpt-4v (ision)
Large multimodal models (LMMs) extend large language models (LLMs) with multi-sensory
skills, such as visual understanding, to achieve stronger generic intelligence. In this paper …
skills, such as visual understanding, to achieve stronger generic intelligence. In this paper …
Open-vocabulary panoptic segmentation with text-to-image diffusion models
We present ODISE: Open-vocabulary DIffusion-based panoptic SEgmentation, which unifies
pre-trained text-image diffusion and discriminative models to perform open-vocabulary …
pre-trained text-image diffusion and discriminative models to perform open-vocabulary …
Sam 2: Segment anything in images and videos
We present Segment Anything Model 2 (SAM 2), a foundation model towards solving
promptable visual segmentation in images and videos. We build a data engine, which …
promptable visual segmentation in images and videos. We build a data engine, which …
Vision-language models for vision tasks: A survey
Most visual recognition studies rely heavily on crowd-labelled data in deep neural networks
(DNNs) training, and they usually train a DNN for each single visual recognition task …
(DNNs) training, and they usually train a DNN for each single visual recognition task …
Tool learning with foundation models
Humans possess an extraordinary ability to create and utilize tools. With the advent of
foundation models, artificial intelligence systems have the potential to be equally adept in …
foundation models, artificial intelligence systems have the potential to be equally adept in …