A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

How far are we to GPT-4V? Closing the gap to commercial multimodal models with open-source suites

Z Chen, W Wang, H Tian, S Ye, Z Gao, E Cui… - Science China …, 2024 - Springer
In this paper, we introduce InternVL 1.5, an open-source multimodal large language model
(MLLM) to bridge the capability gap between open-source and proprietary commercial …

mPLUG-Owl2: Revolutionizing multi-modal large language model with modality collaboration

Q Ye, H Xu, J Ye, M Yan, A Hu, H Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Multi-modal Large Language Models (MLLMs) have demonstrated impressive
instruction abilities across various open-ended tasks. However, previous methods have …

TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu, Q Zhang, Y Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs), exemplified by ChatGPT, have gained considerable
attention for their excellent natural language processing capabilities. Nonetheless, these …

MiniCPM-V: A GPT-4V level MLLM on your phone

Y Yao, T Yu, A Zhang, C Wang, J Cui, H Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent surge of Multimodal Large Language Models (MLLMs) has fundamentally
reshaped the landscape of AI research and industry, shedding light on a promising path …

LVLM-eHub: A comprehensive evaluation benchmark for large vision-language models

P Xu, W Shao, K Zhang, P Gao, S Liu… - … on Pattern Analysis …, 2024 - ieeexplore.ieee.org
Large Vision-Language Models (LVLMs) have recently played a dominant role in
multimodal vision-language learning. Despite the great success, the field lacks a holistic evaluation …

Blink: Multimodal large language models can see but not perceive

X Fu, Y Hu, B Li, Y Feng, H Wang, X Lin, D Roth… - … on Computer Vision, 2024 - Springer
We introduce Blink, a new benchmark for multimodal large language models (MLLMs) that focuses
on core visual perception abilities not found in other evaluations. Most of the Blink tasks can …

Position: TrustLLM: Trustworthiness in large language models

Y Huang, L Sun, H Wang, S Wu… - International …, 2024 - proceedings.mlr.press
Large language models (LLMs) have gained considerable attention for their excellent
natural language processing capabilities. Nonetheless, these LLMs present many …

When do we not need larger vision models?

B Shi, Z Wu, M Mao, X Wang, T Darrell - European Conference on …, 2024 - Springer
Scaling up the size of vision models has been the de facto standard to obtain more powerful
visual representations. In this work, we discuss the point beyond which larger vision models …

SPHINX-X: Scaling data and parameters for a family of multi-modal large language models

D Liu, R Zhang, L Qiu, S Huang, W Lin, S Zhao… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose SPHINX-X, an extensive Multimodality Large Language Model (MLLM) series
developed upon SPHINX. To improve the architecture and training efficiency, we modify the …