Medical visual question answering: A survey
Medical Visual Question Answering (VQA) is a combination of medical artificial
intelligence and popular VQA challenges. Given a medical image and a clinically relevant …
PMC-VQA: Visual instruction tuning for medical visual question answering
PMC-CLIP: Contrastive language-image pre-training using biomedical documents
Foundation models trained on large-scale dataset gain a recent surge in CV and NLP. In
contrast, development in biomedical domain lags far behind due to data scarcity. To address …
Multi-modal masked autoencoders for medical vision-and-language pre-training
Medical vision-and-language pre-training provides a feasible solution to extract effective
vision-and-language representations from medical images and texts. However, few studies …
Align, reason and learn: Enhancing medical vision-and-language pre-training with knowledge
Medical vision-and-language pre-training (Med-VLP) has received considerable attention
owing to its applicability to extracting generic vision-and-language representations from …
Does CLIP benefit visual question answering in the medical domain as much as it does in the general domain?
Contrastive Language–Image Pre-training (CLIP) has shown remarkable success in
learning with cross-modal supervision from extensive amounts of image–text pairs collected …
PubMedCLIP: How much does CLIP benefit visual question answering in the medical domain?
Contrastive Language–Image Pre-training (CLIP) has shown remarkable success
in learning with cross-modal supervision from extensive amounts of image–text pairs …
Open-ended medical visual question answering through prefix tuning of language models
Medical Visual Question Answering (VQA) is an important challenge, as it would
lead to faster and more accurate diagnoses and treatment decisions. Most existing methods …
Towards unifying medical vision-and-language pre-training via soft prompts
Medical vision-and-language pre-training (Med-VLP) has shown promising improvements
on many downstream medical tasks owing to its applicability to extracting generic …