Vision transformer architecture and applications in digital health: a tutorial and survey

K Al-Hammuri, F Gebali, A Kanan… - Visual computing for …, 2023 - Springer
The vision transformer (ViT) is a state-of-the-art architecture for image recognition tasks that
plays an important role in digital health applications. Medical images account for 90% of the …
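
As background for entries like this one, here is a minimal sketch of the core ViT pipeline the survey covers (patch embedding, transformer encoder, classification from a [CLS] token); all hyperparameters are illustrative and not taken from the survey.

```python
import torch
import torch.nn as nn

class TinyViT(nn.Module):
    """Minimal ViT: conv patch embedding -> encoder -> [CLS] head."""
    def __init__(self, img_size=224, patch=16, dim=192, depth=4, heads=3, classes=10):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        # Non-overlapping patches are embedded with a strided convolution.
        self.embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos = nn.Parameter(torch.zeros(1, n_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, classes)

    def forward(self, x):
        x = self.embed(x).flatten(2).transpose(1, 2)   # (B, N, dim)
        cls = self.cls.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos
        x = self.encoder(x)
        return self.head(x[:, 0])                      # classify from [CLS]

logits = TinyViT()(torch.randn(2, 3, 224, 224))        # -> shape (2, 10)
```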

Supervised masked knowledge distillation for few-shot transformers

H Lin, G Han, J Ma, S Huang, X Lin… - Proceedings of the …, 2023 - openaccess.thecvf.com
Vision Transformers (ViTs) have emerged to achieve impressive performance on many
data-abundant computer vision tasks by capturing long-range dependencies among local …
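
The gist as the snippet describes it can be sketched as a student ViT trained on a masked view against a full-view teacher, combined with a supervised term; the masking scheme and loss weights below are illustrative assumptions, not the authors' exact recipe.

```python
import torch
import torch.nn.functional as F

def masked_distill_loss(student, teacher, images, labels,
                        mask_ratio=0.5, tau=2.0, alpha=0.5):
    # Crude stand-in for patch masking: zero out random 16x16 regions.
    B, C, H, W = images.shape
    mask = (torch.rand(B, 1, H // 16, W // 16,
                       device=images.device) > mask_ratio).float()
    mask = F.interpolate(mask, size=(H, W), mode="nearest")
    s_logits = student(images * mask)              # student sees the masked view
    with torch.no_grad():
        t_logits = teacher(images)                 # teacher sees the full image
    kd = F.kl_div(F.log_softmax(s_logits / tau, dim=-1),
                  F.softmax(t_logits / tau, dim=-1),
                  reduction="batchmean") * tau ** 2
    ce = F.cross_entropy(s_logits, labels)         # supervised term
    return alpha * kd + (1 - alpha) * ce
```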

Neural clustering based visual representation learning

G Chen, X Li, Y Yang, W Wang - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
We investigate a fundamental aspect of machine vision, the measurement of features, by
revisiting clustering, one of the most classic approaches in machine learning and data …
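
A generic toy of clustering-as-representation, not the authors' architecture: patch features are softly assigned to learnable prototypes, and the pooled assignment histogram acts as the image-level representation.

```python
import torch
import torch.nn.functional as F

def soft_cluster_assign(feats, prototypes, temperature=0.1):
    """feats: (B, N, D) patch features; prototypes: (K, D) cluster centers."""
    feats = F.normalize(feats, dim=-1)
    protos = F.normalize(prototypes, dim=-1)
    logits = feats @ protos.t() / temperature      # cosine-similarity scores
    assign = logits.softmax(dim=-1)                # (B, N, K) soft assignments
    # Pool assignments over patches to get an image-level cluster histogram.
    return assign.mean(dim=1)                      # (B, K)

feats = torch.randn(2, 196, 64)
protos = torch.randn(16, 64)
print(soft_cluster_assign(feats, protos).shape)    # torch.Size([2, 16])
```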

Mimetic initialization of self-attention layers

A Trockman, JZ Kolter - International Conference on …, 2023 - proceedings.mlr.press
It is notoriously difficult to train Transformers on small datasets; typically, large pre-trained
models are instead used as the starting point. We explore the weights of such pre-trained …
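
The core trick can be sketched as initializing the query/key projections so that W_q W_kᵀ is approximately a scaled identity, making attention start near the diagonal patterns observed in pretrained models. The constant alpha and the exact construction below are illustrative; the paper's recipe also covers the value/projection weights.

```python
import torch

def mimetic_qk_init(dim, alpha=0.7):
    # A shared orthogonal matrix makes W_q @ W_k.T exactly alpha * I here;
    # the paper's variant adds noise so it is only approximately identity.
    Z = torch.randn(dim, dim) / dim ** 0.5
    U, _, Vh = torch.linalg.svd(Z)
    Q = U @ Vh                                    # random orthogonal matrix
    W_q = alpha ** 0.5 * Q
    W_k = alpha ** 0.5 * Q
    return W_q, W_k

W_q, W_k = mimetic_qk_init(64)
print(torch.allclose(W_q @ W_k.t(), 0.7 * torch.eye(64), atol=1e-5))  # True
```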

Frozen feature augmentation for few-shot image classification

A Bär, N Houlsby, M Dehghani… - Proceedings of the …, 2024 - openaccess.thecvf.com
Training a linear classifier or lightweight model on top of pretrained vision model outputs,
so-called 'frozen features', leads to impressive performance on a number of downstream few …
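
A minimal sketch of the setup: features come from a frozen backbone and only a linear probe is trained, with augmentation applied in feature space. The Gaussian-noise augmentation and the 512-d feature size are placeholder assumptions; the paper studies a much broader set of frozen-feature augmentations.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

@torch.no_grad()
def extract_frozen(backbone, images):
    backbone.eval()
    return backbone(images)                        # backbone is never updated

def probe_step(probe, opt, feats, labels, noise_std=0.05):
    feats = feats + noise_std * torch.randn_like(feats)  # feature-space augment
    loss = F.cross_entropy(probe(feats), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

probe = nn.Linear(512, 10)                         # 512-d features: an assumption
opt = torch.optim.SGD(probe.parameters(), lr=0.1)
```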

Vision transformer promotes cancer diagnosis: A comprehensive review

X Jiang, S Wang, Y Zhang - Expert Systems with Applications, 2024 - Elsevier
Background: Approaches based on vision transformers (ViTs) are advancing the field of
medical artificial intelligence (AI) and cancer diagnosis. Recently, many researchers have …

Swin MAE: masked autoencoders for small datasets

Y Dai, F Liu, W Chen, Y Liu, L Shi, S Liu… - Computers in biology and …, 2023 - Elsevier
The development of deep learning models in medical image analysis is severely limited by
the lack of large, well-annotated datasets. Unsupervised learning does not require …
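
A generic masked-autoencoder objective, sketched below: keep a random subset of patches, encode only those, and reconstruct the masked ones. The Swin-specific window attention is omitted, and `encoder`/`decoder` (including the decoder's signature) are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def mae_loss(encoder, decoder, patches, mask_ratio=0.75):
    """patches: (B, N, P) flattened image patches."""
    B, N, P = patches.shape
    n_keep = int(N * (1 - mask_ratio))
    idx = torch.rand(B, N).argsort(dim=1)          # random patch order per image
    keep = idx[:, :n_keep]                         # indices of visible patches
    visible = torch.gather(patches, 1, keep.unsqueeze(-1).expand(-1, -1, P))
    latent = encoder(visible)                      # encode visible patches only
    recon = decoder(latent, keep, N)               # hypothetical decoder signature
    masked = torch.ones(B, N, dtype=torch.bool)
    masked[torch.arange(B).unsqueeze(1), keep] = False
    return F.mse_loss(recon[masked], patches[masked])  # loss on masked patches only
```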

Scalarization for multi-task and multi-domain learning at scale

A Royer, T Blankevoort… - Advances in Neural …, 2023 - proceedings.neurips.cc
Training a single model on multiple input domains and/or output tasks allows for
compressing information from multiple sources into a unified backbone, hence improving …
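
The baseline the paper revisits is easy to state: one shared backbone, per-task heads, and a fixed weighted sum (scalarization) of the task losses. The task names and weights below are placeholders.

```python
import torch

def scalarized_loss(task_losses, weights):
    """task_losses: dict name -> scalar loss; weights: dict name -> float."""
    return sum(weights[t] * loss for t, loss in task_losses.items())

losses = {"seg": torch.tensor(0.9), "depth": torch.tensor(1.4)}
total = scalarized_loss(losses, {"seg": 1.0, "depth": 0.5})
print(total)  # tensor(1.6000)
```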

Initializing models with larger ones

Z Xu, Y Chen, K Vishniakov, Y Yin, Z Shen… - arXiv preprint arXiv …, 2023 - arxiv.org
Weight initialization plays an important role in neural network training. Widely used
initialization methods have been proposed and evaluated for networks trained from scratch …
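
The snippet's idea can be sketched as weight selection: initialize each tensor of a small model by copying a sub-tensor from the corresponding tensor of a larger pretrained model. Taking the leading slice in every dimension is one simple selection rule; the paper evaluates several.

```python
import torch

def select_weights(small_state, large_state):
    out = {}
    for name, w_small in small_state.items():
        w_large = large_state[name]
        # Take the leading slice of the large tensor in every dimension.
        slices = tuple(slice(0, s) for s in w_small.shape)
        out[name] = w_large[slices].clone()
    return out

small = {"fc.weight": torch.empty(4, 8)}
large = {"fc.weight": torch.randn(16, 32)}
init = select_weights(small, large)
print(init["fc.weight"].shape)  # torch.Size([4, 8])
```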

Diffusion-based neural network weights generation

B Soro, B Andreis, H Lee, W Jeong, S Chong… - arXiv preprint arXiv …, 2024 - arxiv.org
Transfer learning has gained significant attention in recent deep learning research due to its
ability to accelerate convergence and enhance performance on new tasks. However, its …
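
As a toy illustration only: flattened network weights are treated as the data a diffusion model learns to denoise, and new weights are sampled by iterative denoising. The update rule below is a crude Euler-style stand-in, not a proper DDPM sampler, and `denoiser` is a hypothetical placeholder for whatever network the method trains.

```python
import torch

def sample_weights(denoiser, dim, steps=50):
    x = torch.randn(dim)                           # start from pure noise
    for t in reversed(range(steps)):
        t_frac = torch.tensor(t / steps)           # normalized timestep
        eps = denoiser(x, t_frac)                  # predicted noise component
        x = x - eps / steps                        # crude Euler-style update
    return x                                       # flattened candidate weights

fake_denoiser = lambda x, t: 0.1 * x               # placeholder, assumption
w = sample_weights(fake_denoiser, dim=128)
```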