- Academic Search

B Chen, Z Zhang, N Langrené, S Zhu - arXiv preprint arXiv:2310.14735, 2023 - arxiv.org

This comprehensive review delves into the pivotal role of prompt engineering in unleashing
the capabilities of Large Language Models (LLMs). The development of Artificial Intelligence …

Cited by 256. All 3 versions.

[PDF] frontiersin.org

Vision-language models for medical report generation and visual question answering: A review

I Hartsock, G Rasool - Frontiers in Artificial Intelligence, 2024 - frontiersin.org

Medical vision-language models (VLMs) combine computer vision (CV) and natural
language processing (NLP) to analyze visual and textual medical data. Our paper reviews …

Cited by 52. All 5 versions.

[PDF] nature.com

ChatMOF: an artificial intelligence system for predicting and generating metal-organic frameworks using large language models

Y Kang, J Kim - Nature communications, 2024 - nature.com

ChatMOF is an artificial intelligence (AI) system that is built to predict and generate metal-
organic frameworks (MOFs). By leveraging a large-scale language model (GPT-4, GPT-3.5 …

Cited by 59. All 12 versions.

[PDF] ieee.org

Vision language models in autonomous driving: A survey and outlook

X Zhou, M Liu, E Yurtsever, BL Zagar… - IEEE Transactions …, 2024 - ieeexplore.ieee.org

The applications of Vision-Language Models (VLMs) in the field of Autonomous Driving (AD)
have attracted widespread attention due to their outstanding performance and the ability to …

Cited by 76. All 6 versions.

[PDF] arxiv.org

Toward general-purpose robots via foundation models: A survey and meta-analysis

Y Hu, Q Xie, V Jain, J Francis, J Patrikar… - arXiv preprint arXiv …, 2023 - arxiv.org

Building general-purpose robots that operate seamlessly in any environment, with any
object, and utilizing various skills to complete diverse tasks has been a long-standing goal in …

Cited by 75. All 3 versions.

[PDF] thecvf.com

Textcraftor: Your text encoder can be image quality controller

Y Li, X Liu, A Kag, J Hu, Y Idelbayev… - Proceedings of the …, 2024 - openaccess.thecvf.com

Diffusion-based text-to-image generative models, e.g., Stable Diffusion, have revolutionized the
field of content generation, enabling significant advancements in areas like image editing …

Cited by 16. All 7 versions.

[PDF] thecvf.com

Handiffuser: Text-to-image generation with realistic hand appearances

S Narasimhaswamy, U Bhattacharya… - Proceedings of the …, 2024 - openaccess.thecvf.com

Text-to-image generative models can generate high-quality humans, but realism is lost when
generating hands. Common artifacts include irregular hand poses, shapes, incorrect numbers …

Cited by 24. All 5 versions.

[PDF] thecvf.com

Self-discovering interpretable diffusion latent directions for responsible text-to-image generation

H Li, C Shen, P Torr, V Tresp… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com

Diffusion-based models have gained significant popularity for text-to-image generation due
to their exceptional image-generation capabilities. A risk with these models is the potential …

Cited by 30. All 8 versions.

[PDF] arxiv.org

CLIP in medical imaging: A comprehensive survey

Z Zhao, Y Liu, H Wu, M Wang, Y Li, S Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

Contrastive Language-Image Pre-training (CLIP), a simple yet effective pre-training
paradigm, successfully introduces text supervision to vision models. It has shown promising …

Cited by 58. All 3 versions.

[PDF] arxiv.org

Foundational models in medical imaging: A comprehensive survey and future vision

B Azad, R Azad, S Eskandari, A Bozorgpour… - arXiv preprint arXiv …, 2023 - arxiv.org

Foundation models, large-scale, pre-trained deep-learning models adapted to a wide range
of downstream tasks, have gained significant interest lately in various deep-learning …

Cited by 63. All 3 versions.

A systematic survey of prompt engineering on vision-language foundation models

Unleashing the potential of prompt engineering in large language models: a comprehensive review
