Advances in medical image analysis with vision transformers: a comprehensive review

R Azad, A Kazerouni, M Heidari, EK Aghdam… - Medical Image …, 2024 - Elsevier
The remarkable performance of the Transformer architecture in natural language processing
has recently also triggered broad interest in Computer Vision. Among other merits …

Transformers in medical imaging: A survey

F Shamshad, S Khan, SW Zamir, MH Khan… - Medical Image …, 2023 - Elsevier
Following unprecedented success on the natural language tasks, Transformers have been
successfully applied to several computer vision problems, achieving state-of-the-art results …

A generalist vision–language foundation model for diverse biomedical tasks

K Zhang, R Zhou, E Adhikarla, Z Yan, Y Liu, J Yu… - Nature Medicine, 2024 - nature.com
Traditional biomedical artificial intelligence (AI) models, designed for specific tasks or
modalities, often exhibit limited flexibility in real-world deployment and struggle to utilize …

A survey of large language models in medicine: Progress, application, and challenge

H Zhou, F Liu, B Gu, X Zou, J Huang, J Wu, Y Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs), such as ChatGPT, have received substantial attention due
to their capabilities for understanding and generating human language. While there has …

Dynamic graph enhanced contrastive learning for chest X-ray report generation

M Li, B Lin, Z Chen, H Lin, X Liang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Automatic radiology reporting has great clinical potential to relieve radiologists from heavy
workloads and improve diagnosis interpretation. Recently, researchers have enhanced data …

Interactive and explainable region-guided radiology report generation

T Tanida, P Müller, G Kaissis… - Proceedings of the …, 2023 - openaccess.thecvf.com
The automatic generation of radiology reports has the potential to assist radiologists in the
time-consuming task of report writing. Existing methods generate the full report from image …

From show to tell: A survey on deep learning-based image captioning

M Stefanini, M Cornia, L Baraldi… - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Connecting Vision and Language plays an essential role in Generative Intelligence. For this
reason, large research efforts have been devoted to image captioning, i.e. describing images …

METransformer: Radiology report generation by transformer with multiple learnable expert tokens

Z Wang, L Liu, L Wang, L Zhou - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In clinical scenarios, multi-specialist consultation could significantly benefit the diagnosis,
especially for intricate cases. This inspires us to explore a "multi-expert joint diagnosis" …

BiomedGPT: A unified and generalist biomedical generative pre-trained transformer for vision, language, and multimodal tasks

K Zhang, J Yu, E Adhikarla, R Zhou, Z Yan… - arXiv e …, 2023 - ui.adsabs.harvard.edu
Conventional task- and modality-specific artificial intelligence (AI) models are inflexible in
real-world deployment and maintenance for biomedicine. At the same time, the growing …

KiUT: Knowledge-injected U-Transformer for radiology report generation

Z Huang, X Zhang, S Zhang - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Radiology report generation aims to automatically generate a clinically accurate and
coherent paragraph from the X-ray image, which could relieve radiologists from the heavy …