Advances in medical image analysis with vision transformers: a comprehensive review

R Azad, A Kazerouni, M Heidari, EK Aghdam… - Medical Image …, 2024 - Elsevier
The remarkable performance of the Transformer architecture in natural language processing
has recently also triggered broad interest in Computer Vision. Among other merits …

LVM-Med: Learning large-scale self-supervised vision models for medical imaging via second-order graph matching

DMH Nguyen, H Nguyen, N Diep… - Advances in …, 2023 - proceedings.neurips.cc
Obtaining large pre-trained models that can be fine-tuned to new tasks with limited
annotated samples has remained an open challenge for medical imaging data. While pre …

Pre-training auto-generated volumetric shapes for 3D medical image segmentation

R Tadokoro, R Yamada… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In 3D medical image segmentation, data collection and annotation costs require significant
human efforts. Moreover, obtaining training data is challenging due to privacy constraints …

Dynamic graph clustering learning for unsupervised diabetic retinopathy classification

C Yu, H Pei - Diagnostics, 2023 - mdpi.com
Diabetic retinopathy (DR) is a common complication of diabetes, which can lead to vision
loss. Early diagnosis is crucial to prevent the progression of DR. In recent years, deep …

Accelerating Transformers with Spectrum-Preserving Token Merging

HC Tran, DMH Nguyen, DM Nguyen… - arXiv preprint arXiv …, 2024 - arxiv.org
Increasing the throughput of the Transformer architecture, a foundational component used in
numerous state-of-the-art models for vision and language tasks (e.g., GPT, LLaVA), is an …

Disruptive Autoencoders: Leveraging Low-level features for 3D Medical Image Pre-training

JMJ Valanarasu, Y Tang, D Yang, Z Xu, C Zhao… - arXiv preprint arXiv …, 2023 - arxiv.org
Harnessing the power of pre-training on large-scale datasets like ImageNet forms a
fundamental building block for the progress of representation learning-driven solutions in …

DRG-Net: Interactive joint learning of multi-lesion segmentation and classification for diabetic retinopathy grading

HM Tusfiqur, DMH Nguyen, MTN Truong… - arXiv preprint arXiv …, 2022 - arxiv.org
Diabetic Retinopathy (DR) is a leading cause of vision loss in the world, and early DR
detection is necessary to prevent vision loss and support an appropriate treatment. In this …

Unified Medical Image Pre-training in Language-Guided Common Semantic Space

X He, Y Yang, X Jiang, X Luo, H Hu, S Zhao… - … on Computer Vision, 2024 - Springer
Vision-Language Pre-training (VLP) has shown the merits of analysing medical
images. It efficiently learns visual representations by leveraging supervisions in their …

KDC-MAE: Knowledge Distilled Contrastive Mask Auto-Encoder

M Bora, S Atreya, A Mukherjee, A Das - arXiv preprint arXiv:2411.12270, 2024 - arxiv.org
In this work, we attempted to extend the thought and showcase a way forward for the
Self-supervised Learning (SSL) paradigm by combining contrastive learning, self …

Primitive Geometry Segment Pre-training for 3D Medical Image Segmentation

R Tadokoro, R Yamada, K Nakashima… - arXiv preprint arXiv …, 2024 - arxiv.org
The construction of 3D medical image datasets presents several issues, including requiring
significant financial costs in data collection and specialized expertise for annotation, as well …