Llava-med: Training a large language-and-vision assistant for biomedicine in one day

C Li, C Wong, S Zhang, N Usuyama… - Advances in …, 2023 - proceedings.neurips.cc
Conversational generative AI has demonstrated remarkable promise for empowering
biomedical practitioners, but current investigations focus on unimodal text. Multimodal …

Xraygpt: Chest radiographs summarization using medical vision-language models

O Thawkar, A Shaker, SS Mullappilly… - ar…

… ChatGPT for biology and medicine: a complete review of biomedical question answering

Q Li, L Li, Y Li - Biophysics Reports, 2024 - pmc.ncbi.nlm.nih.gov
ChatGPT explores a strategic blueprint of question answering (QA) to deliver medical
diagnoses, treatment recommendations, and other healthcare support. This is achieved …

Surgical-lvlm: Learning to adapt large vision-language model for grounded visual question answering in robotic surgery

G Wang, L Bai, WJ Nah, J Wang, Z Zhang… - arxiv preprint arxiv …, 2024 - arxiv.org
Recent advancements in Surgical Visual Question Answering (Surgical-VQA) and related
region grounding have shown great promise for robotic and medical applications …

Ophglm: Training an ophthalmology large language-and-vision assistant based on instructions and dialogue

W Gao, Z Deng, Z Niu, F Rong, C Chen, Z Gong… - arxiv preprint arxiv …, 2023 - arxiv.org
Large multimodal language models (LMMs) have achieved significant success in general
domains. However, due to the significant differences between medical images and text and …

Lapa: Latent prompt assist model for medical visual question answering

T Gu, K Yang, D Liu, W Cai - Proceedings of the IEEE/CVF …, 2024 - openaccess.thecvf.com
Medical visual question answering (Med-VQA) aims to automate the prediction of correct
answers for medical images and questions thereby assisting physicians in reducing …

A comprehensive study of gpt-4v's multimodal capabilities in medical imaging

Y Li, Y Liu, Z Wang, X Liang, L Liu, L Wang, L Cui, Z Tu… - medRxiv, 2023 - medrxiv.org
This paper presents a comprehensive evaluation of GPT-4V's capabilities across diverse
medical imaging tasks, including Radiology Report Generation, Medical Visual Question …

A systematic evaluation of GPT-4V's multimodal capability for chest X-ray image analysis

Y Liu, Y Li, Z Wang, X Liang, L Liu, L Wang, L Cui, Z Tu… - Meta-Radiology, 2024 - Elsevier
This work evaluates GPT-4V's multimodal capability for medical image analysis, focusing on
three representative tasks radiology report generation, medical visual question answering …