A survey on self-supervised learning: Algorithms, applications, and future trends

J Gui, T Chen, J Zhang, Q Cao, Z Sun… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Deep supervised learning algorithms typically require a large volume of labeled data to
achieve satisfactory performance. However, the process of collecting and labeling such data …

CLIP in medical imaging: A comprehensive survey

Z Zhao, Y Liu, H Wu, M Wang, Y Li, S Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Contrastive Language-Image Pre-training (CLIP), a simple yet effective pre-training
paradigm, successfully introduces text supervision to vision models. It has shown promising …

EfficientSAM: Leveraged masked image pretraining for efficient segment anything

Y Xiong, B Varadarajan, L Wu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Segment Anything Model (SAM) has emerged as a powerful tool for numerous
vision applications. A key component that drives the impressive performance for zero-shot …

Cut and learn for unsupervised object detection and instance segmentation

X Wang, R Girdhar, SX Yu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
We propose Cut-and-LEaRn (CutLER), a simple approach for training
unsupervised object detection and segmentation models. We leverage the property of self …

Spot-the-difference self-supervised pre-training for anomaly detection and segmentation

Y Zou, J Jeong, L Pemula, D Zhang… - European Conference on …, 2022 - Springer
Visual anomaly detection is commonly used in industrial quality inspection. In this paper, we
present a new dataset as well as a new self-supervised learning method for ImageNet pre …

Context autoencoder for self-supervised representation learning

X Chen, M Ding, X Wang, Y Xin, S Mo, Y Wang… - International Journal of …, 2024 - Springer
We present a novel masked image modeling (MIM) approach, context autoencoder (CAE),
for self-supervised representation pretraining. We pretrain an encoder by making predictions …

DenseCLIP: Language-guided dense prediction with context-aware prompting

Y Rao, W Zhao, G Chen, Y Tang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Recent progress has shown that large-scale pre-training using contrastive image-text pairs
can be a promising alternative for high-quality visual representation learning from natural …

iBOT: Image BERT pre-training with online tokenizer

J Zhou, C Wei, H Wang, W Shen, C Xie, A Yuille… - arXiv preprint arXiv …, 2021 - arxiv.org
The success of language Transformers is primarily attributed to the pretext task of masked
language modeling (MLM), where texts are first tokenized into semantically meaningful …

CRIS: CLIP-driven referring image segmentation

Z Wang, Y Lu, Q Li, X Tao, Y Guo… - Proceedings of the …, 2022 - openaccess.thecvf.com
Referring image segmentation aims to segment a referent via a natural linguistic expression.
Due to the distinct data properties between text and image, it is challenging for a network to …

TS2Vec: Towards universal representation of time series

Z Yue, Y Wang, J Duan, T Yang, C Huang… - Proceedings of the …, 2022 - ojs.aaai.org
This paper presents TS2Vec, a universal framework for learning representations of time
series at an arbitrary semantic level. Unlike existing methods, TS2Vec performs contrastive …