CLIP in medical imaging: A comprehensive survey

Z Zhao, Y Liu, H Wu, M Wang, Y Li, S Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Contrastive Language-Image Pre-training (CLIP), a simple yet effective pre-training
paradigm, successfully introduces text supervision to vision models. It has shown promising …

Generative Artificial Intelligence (AI) in Pathology and Medicine: A Deeper Dive

HH Rashidi, J Pantanowitz, A Chamanzar, B Fennell… - Modern Pathology, 2024 - Elsevier
This review article builds upon the introductory piece in our seven-part series, delving
deeper into the transformative potential of generative artificial intelligence (Gen AI) in …

One model to rule them all: Towards universal segmentation for medical images with text prompts

Z Zhao, Y Zhang, C Wu, X Zhang, Y Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
In this study, we focus on building up a model that can Segment Anything in medical
scenarios, driven by Text prompts, termed as SAT. Our main contributions are three folds: (i) …

MM-Retinal: Knowledge-Enhanced Foundational Pretraining with Fundus Image-Text Expertise

R Wu, C Zhang, J Zhang, Y Zhou, T Zhou… - … Conference on Medical …, 2024 - Springer
Current fundus image analysis models are predominantly built for specific tasks relying on
individual datasets. The learning process is usually based on data-driven paradigm without …

T3D: Towards 3D medical image understanding through vision-language pre-training

C Liu, C Ouyang, Y Chen, CC Quilodrán-Casas… - arXiv preprint arXiv …, 2023 - arxiv.org
Expert annotation of 3D medical images for downstream analysis is resource-intensive,
posing challenges in clinical applications. Visual self-supervised learning (vSSL), though …

Medical vision language pretraining: A survey

P Shrestha, S Amgain, B Khanal, CA Linte… - arXiv preprint arXiv …, 2023 - arxiv.org
Medical Vision Language Pretraining (VLP) has recently emerged as a promising solution to
the scarcity of labeled data in the medical domain. By leveraging paired/unpaired vision and …

Foundation model for advancing healthcare: Challenges, opportunities, and future directions

Y He, F Huang, X Jiang, Y Nie, M Wang, J Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Foundation model, which is pre-trained on broad data and is able to adapt to a wide range
of tasks, is advancing healthcare. It promotes the development of healthcare artificial …

Medical Multimodal Foundation Models in Clinical Diagnosis and Treatment: Applications, Challenges, and Future Directions

K Sun, S Xue, F Sun, H Sun, Y Luo, L Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in deep learning have significantly revolutionized the field of clinical
diagnosis and treatment, offering novel approaches to improve diagnostic precision and …

Brain-Adapter: Enhancing Neurological Disorder Analysis with Adapter-Tuning Multimodal Large Language Models

J Zhang, X Yu, Y Lyu, L Zhang, T Chen, C Cao… - arXiv preprint arXiv …, 2025 - arxiv.org
Understanding brain disorders is crucial for accurate clinical diagnosis and treatment.
Recent advances in Multimodal Large Language Models (MLLMs) offer a promising …

Tumor Location-weighted MRI-Report Contrastive Learning: A Framework for Improving the Explainability of Pediatric Brain Tumor Diagnosis

S Ketabi, MW Wagner, C Hawkins, U Tabori… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite the promising performance of convolutional neural networks (CNNs) in brain tumor
diagnosis from magnetic resonance imaging (MRI), their integration into the clinical workflow …