Parameter-efficient fine-tuning for large models: A comprehensive survey

Z Han, C Gao, J Liu, J Zhang, SQ Zhang - arXiv preprint arXiv:2403.14608, 2024 - arxiv.org
Large models represent a groundbreaking advancement in multiple application fields,
enabling remarkable achievements across various tasks. However, their unprecedented …

A comprehensive study of ChatGPT: advancements, limitations, and ethical considerations in natural language processing and cybersecurity

M Alawida, S Mejri, A Mehmood, B Chikhaoui… - Information, 2023 - mdpi.com
This paper presents an in-depth study of ChatGPT, a state-of-the-art language model that is
revolutionizing generative text. We provide a comprehensive analysis of its architecture …

ChatCAD: Interactive computer-aided diagnosis on medical image using large language models

S Wang, Z Zhao, X Ouyang, Q Wang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have recently demonstrated their potential in clinical
applications, providing valuable medical knowledge and advice. For example, a large dialog …

Interactive and explainable region-guided radiology report generation

T Tanida, P Müller, G Kaissis… - Proceedings of the …, 2023 - openaccess.thecvf.com
The automatic generation of radiology reports has the potential to assist radiologists in the
time-consuming task of report writing. Existing methods generate the full report from image …

Tip-Adapter: Training-free adaption of CLIP for few-shot classification

R Zhang, W Zhang, R Fang, P Gao, K Li, J Dai… - European conference on …, 2022 - Springer
Contrastive Vision-Language Pre-training, known as CLIP, has provided a new
paradigm for learning visual representations using large-scale image-text pairs. It shows …

Vision-language pre-training: Basics, recent advances, and future trends

Z Gan, L Li, C Li, L Wang, Z Liu… - Foundations and Trends …, 2022 - nowpublishers.com
This monograph surveys vision-language pre-training (VLP) methods for multimodal
intelligence that have been developed in the last few years. We group these approaches …

Multimodal large language models: A survey

J Wu, W Gan, Z Chen, S Wan… - 2023 IEEE International …, 2023 - ieeexplore.ieee.org
The exploration of multimodal language models integrates multiple data types, such as
images, text, language, audio, and other heterogeneous data. While the latest large language …

4D-fy: Text-to-4D generation using hybrid score distillation sampling

S Bahmani, I Skorokhodov, V Rong… - Proceedings of the …, 2024 - openaccess.thecvf.com
Recent breakthroughs in text-to-4D generation rely on pre-trained text-to-image and text-to-
video models to generate dynamic 3D scenes. However, current text-to-4D methods face a …

METransformer: Radiology report generation by transformer with multiple learnable expert tokens

Z Wang, L Liu, L Wang, L Zhou - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In clinical scenarios, multi-specialist consultation could significantly benefit the diagnosis,
especially for intricate cases. This inspires us to explore a "multi-expert joint diagnosis" …

Tip-Adapter: Training-free CLIP-Adapter for better vision-language modeling

R Zhang, R Fang, W Zhang, P Gao, K Li, J Dai… - arXiv preprint arXiv …, 2021 - arxiv.org
Contrastive Vision-Language Pre-training, known as CLIP, has provided a new paradigm for
learning visual representations by using large-scale contrastive image-text pairs. It shows …