QA dataset explosion: A taxonomy of NLP resources for question answering and reading comprehension

A Rogers, M Gardner, I Augenstein - ACM Computing Surveys, 2023 - dl.acm.org
Alongside huge volumes of research on deep learning models in NLP in recent years,
there has been much work on the benchmark datasets needed to track modeling progress …

Generative AI for visualization: State of the art and future directions

Y Ye, J Hao, Y Hou, Z Wang, S Xiao, Y Luo, W Zeng - Visual Informatics, 2024 - Elsevier
Generative AI (GenAI) has witnessed remarkable progress in recent years and
demonstrated impressive performance in various generation tasks in different domains such …

Qwen technical report

J Bai, S Bai, Y Chu, Z Cui, K Dang, X Deng… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have revolutionized the field of artificial intelligence,
enabling natural language processing tasks that were previously thought to be exclusive to …

The Llama 3 herd of models

A Dubey, A Jauhri, A Pandey, A Kadian… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern artificial intelligence (AI) systems are powered by foundation models. This paper
presents a new set of foundation models, called Llama 3. It is a herd of language models …

Cambrian-1: A fully open, vision-centric exploration of multimodal LLMs

P Tong, E Brown, P Wu, S Woo… - Advances in …, 2025 - proceedings.neurips.cc
We introduce Cambrian-1, a family of multimodal LLMs (MLLMs) designed with a vision-
centric approach. While stronger language models can enhance multimodal capabilities, the …

Phi-3 technical report: A highly capable language model locally on your phone

M Abdin, J Aneja, H Awadalla, A Awadallah… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion
tokens, whose overall performance, as measured by both academic benchmarks and …

What matters when building vision-language models?

H Laurençon, L Tronchon, M Cord… - Advances in Neural …, 2025 - proceedings.neurips.cc
The growing interest in vision-language models (VLMs) has been driven by improvements in
large language models and vision transformers. Despite the abundance of literature on this …

CogAgent: A visual language model for GUI agents

W Hong, W Wang, Q Lv, J Xu, W Yu… - Proceedings of the …, 2024 - openaccess.thecvf.com
People are spending an enormous amount of time on digital devices through graphical user
interfaces (GUIs), e.g., computer or smartphone screens. Large language models (LLMs) such …

How far are we to GPT-4V? Closing the gap to commercial multimodal models with open-source suites

Z Chen, W Wang, H Tian, S Ye, Z Gao, E Cui… - Science China …, 2024 - Springer
In this paper, we introduce InternVL 1.5, an open-source multimodal large language model
(MLLM) to bridge the capability gap between open-source and proprietary commercial …

InternLM-XComposer2-4KHD: A pioneering large vision-language model handling resolutions from 336 pixels to 4K HD

X Dong, P Zhang, Y Zang, Y Cao… - Advances in …, 2025 - proceedings.neurips.cc
The Large Vision-Language Model (LVLM) field has seen significant
advancements, yet its progression has been hindered by challenges in comprehending fine …