Unleashing the potential of prompt engineering in large language models: a comprehensive review

B Chen, Z Zhang, N Langrené, S Zhu - arXiv preprint arXiv:2310.14735, 2023 - arxiv.org
This comprehensive review delves into the pivotal role of prompt engineering in unleashing
the capabilities of Large Language Models (LLMs). The development of Artificial Intelligence …

Vision-language models for medical report generation and visual question answering: A review

I Hartsock, G Rasool - Frontiers in Artificial Intelligence, 2024 - frontiersin.org
Medical vision-language models (VLMs) combine computer vision (CV) and natural
language processing (NLP) to analyze visual and textual medical data. Our paper reviews …

ChatMOF: an artificial intelligence system for predicting and generating metal-organic frameworks using large language models

Y Kang, J Kim - Nature Communications, 2024 - nature.com
ChatMOF is an artificial intelligence (AI) system that is built to predict and generate
metal-organic frameworks (MOFs). By leveraging a large-scale language model (GPT-4, GPT-3.5 …

Vision language models in autonomous driving: A survey and outlook

X Zhou, M Liu, E Yurtsever, BL Zagar… - IEEE Transactions …, 2024 - ieeexplore.ieee.org
The applications of Vision-Language Models (VLMs) in the field of Autonomous Driving (AD)
have attracted widespread attention due to their outstanding performance and the ability to …

Toward general-purpose robots via foundation models: A survey and meta-analysis

Y Hu, Q Xie, V Jain, J Francis, J Patrikar… - arXiv preprint arXiv …, 2023 - arxiv.org
Building general-purpose robots that operate seamlessly in any environment, with any
object, and utilizing various skills to complete diverse tasks has been a long-standing goal in …

TextCraftor: Your text encoder can be image quality controller

Y Li, X Liu, A Kag, J Hu, Y Idelbayev… - Proceedings of the …, 2024 - openaccess.thecvf.com
Diffusion-based text-to-image generative models, e.g., Stable Diffusion, have revolutionized the
field of content generation, enabling significant advancements in areas like image editing …

HanDiffuser: Text-to-image generation with realistic hand appearances

S Narasimhaswamy, U Bhattacharya… - Proceedings of the …, 2024 - openaccess.thecvf.com
Text-to-image generative models can generate high-quality humans, but realism is lost when
generating hands. Common artifacts include irregular hand poses, shapes, incorrect numbers …

Self-discovering interpretable diffusion latent directions for responsible text-to-image generation

H Li, C Shen, P Torr, V Tresp… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Diffusion-based models have gained significant popularity for text-to-image generation due
to their exceptional image-generation capabilities. A risk with these models is the potential …

CLIP in medical imaging: A comprehensive survey

Z Zhao, Y Liu, H Wu, M Wang, Y Li, S Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Contrastive Language-Image Pre-training (CLIP), a simple yet effective pre-training
paradigm, successfully introduces text supervision to vision models. It has shown promising …

Foundational models in medical imaging: A comprehensive survey and future vision

B Azad, R Azad, S Eskandari, A Bozorgpour… - arXiv preprint arXiv …, 2023 - arxiv.org
Foundation models, large-scale, pre-trained deep-learning models adapted to a wide range
of downstream tasks, have gained significant interest lately in various deep-learning …