Foundation model for advancing healthcare: challenges, opportunities and future directions

Y He, F Huang, X Jiang, Y Nie, M Wang… - IEEE Reviews in …, 2024 - ieeexplore.ieee.org
Foundation model, trained on a diverse range of data and adaptable to a myriad of tasks, is
advancing healthcare. It fosters the development of healthcare artificial intelligence (AI) …

Has multimodal learning delivered universal intelligence in healthcare? A comprehensive survey

Q Lin, Y Zhu, X Mei, L Huang, J Ma, K He, Z Peng… - Information …, 2024 - Elsevier
The rapid development of artificial intelligence has constantly reshaped the field of
intelligent healthcare and medicine. As a vital technology, multimodal learning has …

PMC-VQA: Visual instruction tuning for medical visual question answering

X Zhang, C Wu, Z Zhao, W Lin, Y Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
Medical Visual Question Answering (MedVQA) presents a significant opportunity to enhance
diagnostic accuracy and healthcare delivery by leveraging artificial intelligence to interpret …

MediConfusion: Can you trust your AI radiologist? Probing the reliability of multimodal medical foundation models

MS Sepehri, Z Fabian, M Soltanolkotabi… - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal Large Language Models (MLLMs) have tremendous potential to improve the
accuracy, availability, and cost-effectiveness of healthcare by providing automated solutions …

MMT-Bench: A comprehensive multimodal benchmark for evaluating large vision-language models towards multitask AGI

K Ying, F Meng, J Wang, Z Li, H Lin, Y Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Vision-Language Models (LVLMs) show significant strides in general-purpose
multimodal applications such as visual dialogue and embodied navigation. However …

HuatuoGPT-Vision, towards injecting medical visual knowledge into multimodal LLMs at scale

J Chen, C Gui, R Ouyang, A Gao, S Chen… - arXiv preprint arXiv …, 2024 - arxiv.org
The rapid development of multimodal large language models (MLLMs), such as GPT-4V,
has led to significant advancements. However, these models still face challenges in medical …

AudioBench: A universal benchmark for audio large language models

B Wang, X Zou, G Lin, S Sun, Z Liu, W Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce AudioBench, a universal benchmark designed to evaluate Audio Large
Language Models (AudioLLMs). It encompasses 8 distinct tasks and 26 datasets, among …

GMAI-MMBench: A comprehensive multimodal evaluation benchmark towards general medical AI

J Ye, G Wang, Y Li, Z Deng, W Li, T Li… - The Thirty-eight …, 2024 - proceedings.neurips.cc
Large Vision-Language Models (LVLMs) are capable of handling diverse data
types such as imaging, text, and physiological signals, and can be applied in various fields …

Towards injecting medical visual knowledge into multimodal LLMs at scale

J Chen, C Gui, R Ouyang, A Gao, S Chen… - Proceedings of the …, 2024 - aclanthology.org
The rapid development of multimodal large language models (MLLMs), such as GPT-4V,
has led to significant advancements. However, these models still face challenges in medical …

MedBench: A comprehensive, standardized, and reliable benchmarking system for evaluating Chinese medical large language models

M Liu, W Hu, J Ding, J Xu, X Li, L Zhu… - Big Data Mining and …, 2024 - ieeexplore.ieee.org
Ensuring the general efficacy and benefit for human beings from medical Large Language
Models (LLM) before real-world deployment is crucial. However, a widely accepted and …