Pre-trained language models in biomedical domain: A systematic survey

B Wang, Q Xie, J Pei, Z Chen, P Tiwari, Z Li… - ACM Computing …, 2023 - dl.acm.org
Pre-trained language models (PLMs) have been the de facto paradigm for most natural
language processing tasks. This also benefits the biomedical domain: researchers from …

CLIP in medical imaging: A comprehensive survey

Z Zhao, Y Liu, H Wu, M Wang, Y Li, S Wang… - arXiv preprint arXiv…, 2023 - arxiv.org
Contrastive Language-Image Pre-training (CLIP), a simple yet effective pre-training
paradigm, successfully introduces text supervision to vision models. It has shown promising …

Making the most of text semantics to improve biomedical vision–language processing

B Boecking, N Usuyama, S Bannur, DC Castro… - European conference on …, 2022 - Springer
Multi-modal data abounds in biomedicine, such as radiology images and reports.
Interpreting this data at scale is essential for improving clinical care and accelerating clinical …

Contrastive learning of medical visual representations from paired images and text

Y Zhang, H Jiang, Y Miura… - Machine Learning …, 2022 - proceedings.mlr.press
Learning visual representations of medical images (e.g., X-rays) is core to medical image
understanding but its progress has been held back by the scarcity of human annotations …
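Several of the entries above (CLIP and the ConVIRT-style medical image-text pre-training work) rest on the same symmetric contrastive objective over paired image and report embeddings. Below is a minimal, illustrative PyTorch sketch of that objective; the function name, the default temperature, and the assumption that both encoders already produce fixed-size embeddings are assumptions for illustration, not details taken from any of the cited papers.

```python
# Minimal sketch of a CLIP/ConVIRT-style symmetric contrastive objective.
# Names and the temperature value are illustrative assumptions, not the
# exact setup of any paper listed above.
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb: torch.Tensor,
                     text_emb: torch.Tensor,
                     temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired image/text embeddings."""
    # L2-normalize so the dot product is cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # Pairwise similarity matrix: logits[i, j] = sim(image_i, text_j) / T.
    logits = image_emb @ text_emb.t() / temperature

    # Matched image-report pairs lie on the diagonal.
    targets = torch.arange(logits.size(0), device=logits.device)

    # Average the image-to-text and text-to-image cross-entropy terms.
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets)
    return 0.5 * (loss_i2t + loss_t2i)
```

In this formulation each image is pulled toward its paired report and pushed away from the other reports in the batch, and vice versa; the knowledge-enhanced and multi-granularity methods cited below build on this common core.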

Knowledge-enhanced visual-language pre-training on chest radiology images

X Zhang, C Wu, Y Zhang, W Xie, Y Wang - Nature Communications, 2023 - nature.com
While multi-modal foundation models pre-trained on large-scale data have been successful
in natural language understanding and vision recognition, their use in medical domains is …

Large-scale domain-specific pretraining for biomedical vision-language processing

S Zhang, Y Xu, N Usuyama, J Bagga… - arXiv preprint arXiv…, 2023 - researchgate.net
Contrastive pretraining on parallel image-text data has attained great success in vision-
language processing (VLP), as exemplified by CLIP and related methods. However, prior …

Multi-granularity cross-modal alignment for generalized medical visual representation learning

F Wang, Y Zhou, S Wang… - Advances in Neural …, 2022 - proceedings.neurips.cc
Learning medical visual representations directly from paired radiology reports has become
an emerging topic in representation learning. However, existing medical image-text joint …

MedKLIP: Medical knowledge enhanced language-image pre-training for X-ray diagnosis

C Wu, X Zhang, Y Zhang, Y Wang… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, we consider enhancing medical visual-language pre-training (VLP) with
domain-specific knowledge, by exploiting the paired image-text reports from the radiological …

PRIOR: Prototype representation joint learning from medical images and reports

P Cheng, L Lin, J Lyu, Y Huang… - Proceedings of the …, 2023 - openaccess.thecvf.com
Contrastive learning based vision-language joint pre-training has emerged as a successful
representation learning strategy. In this paper, we present a prototype representation …

LViT: Language meets vision transformer in medical image segmentation

Z Li, Y Li, Q Li, P Wang, D Guo, L Lu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Deep learning has been widely used in medical image segmentation and other aspects.
However, the performance of existing medical image segmentation models has been limited …