Has multimodal learning delivered universal intelligence in healthcare? A comprehensive survey

Q Lin, Y Zhu, X Mei, L Huang, J Ma, K He, Z Peng… - Information …, 2024 - Elsevier
The rapid development of artificial intelligence has constantly reshaped the field of
intelligent healthcare and medicine. As a vital technology, multimodal learning has …

A generalist vision–language foundation model for diverse biomedical tasks

K Zhang, R Zhou, E Adhikarla, Z Yan, Y Liu, J Yu… - Nature Medicine, 2024 - nature.com
Traditional biomedical artificial intelligence (AI) models, designed for specific tasks or
modalities, often exhibit limited flexibility in real-world deployment and struggle to utilize …

Faithful AI in medicine: a systematic review with large language models and beyond

Q Xie, EJ Schenck, HS Yang, Y Chen, Y Peng, F Wang - medRxiv, 2023 - ncbi.nlm.nih.gov
Artificial intelligence (AI), especially the most recent large language models (LLMs), holds
great promise in healthcare and medicine, with applications spanning from biological …

BiomedGPT: A unified and generalist biomedical generative pre-trained transformer for vision, language, and multimodal tasks

K Zhang, J Yu, E Adhikarla, R Zhou, Z Yan… - arxiv e …, 2023 - ui.adsabs.harvard.edu
Conventional task- and modality-specific artificial intelligence (AI) models are inflexible in
real-world deployment and maintenance for biomedicine. At the same time, the growing …

A vision–language foundation model for the generation of realistic chest X-ray images

C Bluethgen, P Chambon, JB Delbrouck… - Nature Biomedical …, 2024 - nature.com
The paucity of high-quality medical imaging datasets could be mitigated by machine
learning models that generate compositionally diverse images that faithfully represent …

MedKLIP: Medical knowledge enhanced language-image pre-training for X-ray diagnosis

C Wu, X Zhang, Y Zhang, Y Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, we consider enhancing medical visual-language pre-training (VLP) with
domain-specific knowledge, by exploiting the paired image-text reports from the radiological …

RoentGen: Vision-language foundation model for chest X-ray generation

P Chambon, C Bluethgen, JB Delbrouck… - arxiv preprint arxiv …, 2022 - arxiv.org
Multimodal models trained on large natural image-text pair datasets have exhibited
astounding abilities in generating high-quality images. Medical imaging data is …

MAIRA-2: Grounded radiology report generation

S Bannur, K Bouzid, DC Castro, A Schwaighofer… - arxiv preprint arxiv …, 2024 - arxiv.org
Radiology reporting is a complex task requiring detailed medical image understanding and
precise language generation, for which generative multimodal models offer a promising …

RadAdapt: Radiology report summarization via lightweight domain adaptation of large language models

D Van Veen, C Van Uden, M Attias, A Pareek… - arxiv preprint arxiv …, 2023 - arxiv.org
We systematically investigate lightweight strategies to adapt large language models (LLMs)
for the task of radiology report summarization (RRS). Specifically, we focus on domain …

MAIRA-1: A specialised large multimodal model for radiology report generation

SL Hyland, S Bannur, K Bouzid, DC Castro… - arxiv preprint arxiv …, 2023 - arxiv.org
We present a radiology-specific multimodal model for the task of generating radiological
reports from chest X-rays (CXRs). Our work builds on the idea that large language model(s) …