Survey on self-supervised learning: auxiliary pretext tasks and contrastive learning methods in imaging

S Albelwi - Entropy, 2022 - mdpi.com
Although deep learning algorithms have achieved significant progress in a variety of
domains, they require costly annotations on huge datasets. Self-supervised learning (SSL) …

DocFormer: End-to-end transformer for document understanding

S Appalaraju, B Jasani, BU Kota… - Proceedings of the …, 2021 - openaccess.thecvf.com
We present DocFormer, a multi-modal transformer-based architecture for the task of Visual
Document Understanding (VDU). VDU is a challenging problem which aims to understand …

DocFormerv2: Local features for document understanding

S Appalaraju, P Tang, Q Dong, N Sankaran… - Proceedings of the …, 2024 - ojs.aaai.org
We propose DocFormerv2, a multi-modal transformer for Visual Document Understanding
(VDU). The VDU domain entails understanding documents (beyond mere OCR predictions) …

ReMix: A general and efficient framework for multiple instance learning based whole slide image classification

J Yang, H Chen, Y Zhao, F Yang, Y Zhang, L He… - … Conference on Medical …, 2022 - Springer
Whole slide image (WSI) classification often relies on deep weakly supervised multiple
instance learning (MIL) methods to handle gigapixel resolution images and slide-level …

Beyond supervised vs. unsupervised: Representative benchmarking and analysis of image representation learning

M Gwilliam, A Shrivastava - … of the IEEE/CVF Conference on …, 2022 - openaccess.thecvf.com
By leveraging contrastive learning, clustering, and other pretext tasks, unsupervised
methods for learning image representations have reached impressive results on standard …

EMP-SSL: Towards self-supervised learning in one training epoch

S Tong, Y Chen, Y Ma, Y LeCun - arXiv preprint arXiv:2304.03977, 2023 - arxiv.org
Recently, self-supervised learning (SSL) has achieved tremendous success in learning
image representation. Despite the empirical success, most self-supervised learning methods …

Self-supervised video representation learning using improved instance-wise contrastive learning and deep clustering

Y Zhu, H Shuai, G Liu, Q Liu - IEEE Transactions on Circuits …, 2022 - ieeexplore.ieee.org
Instance-wise contrastive learning (Instance-CL), which learns to map similar instances
closer and different instances farther apart in the embedding space, has achieved …

Evaluating self-supervised learning via risk decomposition

Y Dubois, T Hashimoto, P Liang - … Conference on Machine …, 2023 - proceedings.mlr.press
Self-supervised learning (SSL) is typically evaluated using a single metric (linear probing on
ImageNet), which neither provides insight into tradeoffs between models nor highlights how …

Guillotine regularization: Why removing layers is needed to improve generalization in self-supervised learning

F Bordes, R Balestriero, Q Garrido, A Bardes… - arXiv preprint arXiv …, 2022 - arxiv.org
One unexpected technique that emerged in recent years consists in training a Deep Network
(DN) with a Self-Supervised Learning (SSL) method, and using this network on downstream …

Deciphering the projection head: Representation evaluation self-supervised learning

J Ma, T Hu, W Wang - arXiv preprint arXiv:2301.12189, 2023 - arxiv.org
Self-supervised learning (SSL) aims to learn intrinsic features without labels. Despite the
diverse architectures of SSL methods, the projection head always plays an important role in …