A comprehensive survey of few-shot learning: Evolution, applications, challenges, and opportunities
Few-shot learning (FSL) has emerged as an effective learning method and shows great
potential. Despite the recent creative works in tackling FSL tasks, learning valid information …
Weakly supervised object localization and detection: A survey
As an emerging and challenging problem in the computer vision community, weakly
supervised object localization and detection plays an important role for developing new …
Scaling vision transformers to 22 billion parameters
The scaling of Transformers has driven breakthrough capabilities for language models. At
present, the largest large language models (LLMs) contain upwards of 100B parameters …
Revisiting class-incremental learning with pre-trained models: Generalizability and adaptivity are all you need
Class-incremental learning (CIL) aims to adapt to emerging new classes without forgetting
old ones. Traditional CIL models are trained from scratch to continually acquire knowledge …
Segment and Recognize Anything at Any Granularity
In this work, we introduce Semantic-SAM, an augmented image segmentation foundation model for
segmenting and recognizing anything at desired granularities. Compared to the …
Visual prompt tuning
The current modus operandi in adapting pre-trained models involves updating all the
backbone parameters, i.e., full fine-tuning. This paper introduces Visual Prompt Tuning (VPT) …
SLIP: Self-supervision meets language-image pre-training
Recent work has shown that self-supervised pre-training leads to improvements over
supervised learning on challenging visual recognition tasks. CLIP, an exciting new …
Delving into out-of-distribution detection with vision-language representations
Recognizing out-of-distribution (OOD) samples is critical for machine learning systems
deployed in the open world. The vast majority of OOD detection methods are driven by a …
Forward compatible few-shot class-incremental learning
Novel classes frequently arise in our dynamically changing world, e.g., new users in the
authentication system, and a machine learning model should recognize new classes without …
Conformer: Local features coupling global representations for visual recognition
Within a Convolutional Neural Network (CNN), the convolution operations are good
at extracting local features but have difficulty capturing global representations. Within …