A Survey on Self-supervised Learning: Algorithms, Applications, and Future Trends

J Gui, T Chen, J Zhang, Q Cao, Z Sun… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Deep supervised learning algorithms typically require a large volume of labeled data to
achieve satisfactory performance. However, the process of collecting and labeling such data …

Self-supervised learning for point cloud data: A survey

C Zeng, W Wang, A Nguyen, J Xiao, Y Yue - Expert Systems with …, 2024 - Elsevier
3D point clouds are a crucial type of data collected by LiDAR sensors and widely
used in transportation applications due to its concise descriptions and accurate localization …

Self-supervised co-training for video representation learning

T Han, W Xie, A Zisserman - Advances in neural information …, 2020 - proceedings.neurips.cc
The objective of this paper is visual-only self-supervised video representation learning. We
make the following contributions: (i) we investigate the benefit of adding semantic-class …

A large-scale study on unsupervised spatiotemporal representation learning

C Feichtenhofer, H Fan, B Xiong… - Proceedings of the …, 2021 - openaccess.thecvf.com
We present a large-scale study on unsupervised spatiotemporal representation learning
from videos. With a unified perspective on four recent image-based frameworks, we study a …

ViLBERT: Pretraining task-agnostic visiolinguistic representations for vision-and-language tasks

J Lu, D Batra, D Parikh, S Lee - Advances in neural …, 2019 - proceedings.neurips.cc
We present ViLBERT (short for Vision-and-Language BERT), a model for learning task-
agnostic joint representations of image content and natural language. We extend the …

Anticipative video transformer

R Girdhar, K Grauman - Proceedings of the IEEE/CVF …, 2021 - openaccess.thecvf.com
We propose Anticipative Video Transformer (AVT), an end-to-end attention-based
video modeling architecture that attends to the previously observed video in order to …

Data-efficient image recognition with contrastive predictive coding

O Henaff - International conference on machine learning, 2020 - proceedings.mlr.press
Human observers can learn to recognize new categories of images from a handful of
examples, yet doing so with artificial ones remains an open challenge. We hypothesize that …

Self-supervised visual feature learning with deep neural networks: A survey

L Jing, Y Tian - IEEE transactions on pattern analysis and …, 2020 - ieeexplore.ieee.org
Large-scale labeled data are generally required to train deep neural networks in order to
obtain better performance in visual feature learning from images or videos for computer …

Space-time correspondence as a contrastive random walk

A Jabri, A Owens, A Efros - Advances in neural information …, 2020 - proceedings.neurips.cc
This paper proposes a simple self-supervised approach for learning a representation for
visual correspondence from raw video. We cast correspondence as prediction of links in a …

Self-supervised learning for medical image analysis using image context restoration

L Chen, P Bentley, K Mori, K Misawa, M Fujiwara… - Medical image …, 2019 - Elsevier
Machine learning, particularly deep learning has boosted medical image analysis
over the past years. Training a good model based on deep learning requires large amount …