Has multimodal learning delivered universal intelligence in healthcare? A comprehensive survey

Q Lin, Y Zhu, X Mei, L Huang, J Ma, K He, Z Peng… - Information …, 2024 - Elsevier
The rapid development of artificial intelligence has constantly reshaped the field of
intelligent healthcare and medicine. As a vital technology, multimodal learning has …

MAIRA-1: A specialised large multimodal model for radiology report generation

SL Hyland, S Bannur, K Bouzid, DC Castro… - arXiv preprint arXiv …, 2023 - arxiv.org
We present a radiology-specific multimodal model for the task of generating radiological
reports from chest X-rays (CXRs). Our work builds on the idea that large language model(s) …

MAIRA-2: Grounded radiology report generation

S Bannur, K Bouzid, DC Castro, A Schwaighofer… - arXiv preprint arXiv …, 2024 - arxiv.org
Radiology reporting is a complex task requiring detailed medical image understanding and
precise language generation, for which generative multimodal models offer a promising …

MedImageInsight: An open-source embedding model for general domain medical imaging

NCF Codella, Y Jin, S Jain, Y Gu, HH Lee… - arXiv preprint arXiv …, 2024 - arxiv.org
In this work, we present MedImageInsight, an open-source medical imaging embedding
model. MedImageInsight is trained on medical images with associated text and labels …

Can Medical Vision-Language Pre-training Succeed with Purely Synthetic Data?

C Liu, Z Wan, H Wang, Y Chen, T Qaiser, C Jin… - arXiv preprint arXiv …, 2024 - arxiv.org
Medical Vision-Language Pre-training (MedVLP) has made significant progress in enabling
zero-shot tasks for medical image understanding. However, training MedVLP models …

Large-scale benchmarking and boosting transfer learning for medical image analysis

MRH Taher, F Haghighi, MB Gotway, J Liang - Medical Image Analysis, 2025 - Elsevier
Transfer learning, particularly fine-tuning models pretrained on photographic images to
medical images, has proven indispensable for medical image analysis. There are numerous …

LLM-RG4: Flexible and Factual Radiology Report Generation across Diverse Input Contexts

Z Wang, Y Sun, Z Li, X Yang, F Chen, H Liao - arXiv preprint arXiv …, 2024 - arxiv.org
Drafting radiology reports is a complex task requiring flexibility, where radiologists tailor
content to available information and particular clinical demands. However, most current …

An X-Ray Is Worth 15 Features: Sparse Autoencoders for Interpretable Radiology Report Generation

A Abdulaal, H Fry, N Montaña-Brown… - arXiv preprint arXiv …, 2024 - arxiv.org
Radiological services are experiencing unprecedented demand, leading to increased
interest in automating radiology report generation. Existing Vision-Language Models (VLMs) …

DiNO-Diffusion. Scaling Medical Diffusion via Self-Supervised Pre-Training

G Jimenez-Perez, P Osorio, J Cersovsky… - arXiv preprint arXiv …, 2024 - arxiv.org
Diffusion models (DMs) have emerged as powerful foundation models for a variety of tasks,
with a large focus in synthetic image generation. However, their requirement of large …

M4CXR: Exploring Multi-task Potentials of Multi-modal Large Language Models for Chest X-ray Interpretation

J Park, S Kim, B Yoon, J Hyun, K Choi - arXiv preprint arXiv:2408.16213, 2024 - arxiv.org
The rapid evolution of artificial intelligence, especially in large language models (LLMs), has
significantly impacted various domains, including healthcare. In chest X-ray (CXR) analysis …