- Academic Search

BC Das, MH Amini, Y Wu - ACM Computing Surveys, 2025 - dl.acm.org

Large language models (LLMs) have demonstrated extraordinary capabilities and
contributed to multiple fields, such as generating and summarizing text, language …

Lưu Trích dẫn Trích dẫn 696 bài viết Bài viết có liên quan Tất cả 11 phiên bản

[Free GPT-4]
[DeepSeek]

[PDF] oup.com

A Survey of Multimodel Large Language Models

Z Liang, Y Xu, Y Hong, P Shang, Q Wang… - Proceedings of the 3rd …, 2024 - dl.acm.org

With the widespread application of the Transformer architecture in various modalities,
including vision, the technology of large language models is evolving from a single modality …

Lưu Trích dẫn Trích dẫn 1248 bài viết Bài viết có liên quan Tất cả 12 phiên bản

[Free GPT-4]
[DeepSeek]

[PDF] thecvf.com

Improved baselines with visual instruction tuning

H Liu, C Li, Y Li, YJ Lee - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

Large multimodal models (LMM) have recently shown encouraging progress with visual
instruction tuning. In this paper we present the first systematic study to investigate the design …

Lưu Trích dẫn Trích dẫn 1897 bài viết Bài viết có liên quan Tất cả 10 phiên bản Xem dạng HTML

[Free GPT-4]
[DeepSeek]

[PDF] arxiv.org

A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions

L Huang, W Yu, W Ma, W Zhong, Z Feng… - ACM Transactions on …, 2025 - dl.acm.org

The emergence of large language models (LLMs) has marked a significant breakthrough in
natural language processing (NLP), fueling a paradigm shift in information acquisition …

Lưu Trích dẫn Trích dẫn 963 bài viết Bài viết có liên quan Tất cả 4 phiên bản

Siren's song in the AI ocean: a survey on hallucination in large language models

Y Zhang, Y Li, L Cui, D Cai, L Liu, T Fu… - arxiv preprint arxiv …, 2023 - arxiv.org

While large language models (LLMs) have demonstrated remarkable capabilities across a
range of downstream tasks, a significant concern revolves around their propensity to exhibit …

Lưu Trích dẫn Trích dẫn 993 bài viết Bài viết có liên quan Tất cả 2 phiên bản Bản lưu

[Free GPT-4]
[DeepSeek]

[PDF] thecvf.com

Lisa: Reasoning segmentation via large language model

X Lai, Z Tian, Y Chen, Y Li, Y Yuan… - Proceedings of the …, 2024 - openaccess.thecvf.com

Although perception systems have made remarkable advancements in recent years they still
rely on explicit human instruction or pre-defined categories to identify the target objects …

Lưu Trích dẫn Trích dẫn 405 bài viết Bài viết có liên quan Tất cả 7 phiên bản Xem dạng HTML

[Free GPT-4]
[DeepSeek]

[PDF] nowpublishers.com

Multimodal foundation models: From specialists to general-purpose assistants

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com

Neural compression is the application of neural networks and other machine learning
methods to data compression. Recent advances in statistical machine learning have opened …

Lưu Trích dẫn Trích dẫn 231 bài viết Bài viết có liên quan Tất cả 7 phiên bản Tìm kiếm Thư viện Xem dạng HTML

[Free GPT-4]
[DeepSeek]

[PDF] arxiv.org

How far are we to gpt-4v? closing the gap to commercial multimodal models with open-source suites

Z Chen, W Wang, H Tian, S Ye, Z Gao, E Cui… - Science China …, 2024 - Springer

In this paper, we introduce InternVL 1.5, an open-source multimodal large language model
(MLLM) to bridge the capability gap between open-source and proprietary commercial …

Lưu Trích dẫn Trích dẫn 392 bài viết Bài viết có liên quan Tất cả 4 phiên bản

[Free GPT-4]
[DeepSeek]

[PDF] thecvf.com

Internvl: Scaling up vision foundation models and aligning for generic visual-linguistic tasks

Z Chen, J Wu, W Wang, W Su, G Chen… - Proceedings of the …, 2024 - openaccess.thecvf.com

The exponential growth of large language models (LLMs) has opened up numerous
possibilities for multi-modal AGI systems. However the progress in vision and vision …

Lưu Trích dẫn Trích dẫn 185 bài viết Bài viết có liên quan Tất cả 8 phiên bản Xem dạng HTML

[Free GPT-4]
[DeepSeek]

[PDF] thecvf.com

Generative multimodal models are in-context learners

Q Sun, Y Cui, X Zhang, F Zhang, Q Yu… - Proceedings of the …, 2024 - openaccess.thecvf.com

Humans can easily solve multimodal tasks in context with only a few demonstrations or
simple instructions which current multimodal systems largely struggle to imitate. In this work …

Lưu Trích dẫn Trích dẫn 209 bài viết Bài viết có liên quan Tất cả 6 phiên bản Xem dạng HTML

Tạo thông báo

Trích dẫn

Tìm kiếm nâng cao

Đã lưu vào Thư viện của tôi

Instruction tuning for large language models: A survey

Security and privacy challenges of large language models: A survey

A Survey of Multimodel Large Language Models

Improved baselines with visual instruction tuning

A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions

Siren's song in the AI ocean: a survey on hallucination in large language models

Lisa: Reasoning segmentation via large language model

Multimodal foundation models: From specialists to general-purpose assistants

How far are we to gpt-4v? closing the gap to commercial multimodal models with open-source suites

Internvl: Scaling up vision foundation models and aligning for generic visual-linguistic tasks

Generative multimodal models are in-context learners