Mm-llms: Recent advances in multimodal large language models
In the past year, MultiModal Large Language Models (MM-LLMs) have undergone
substantial advancements, augmenting off-the-shelf LLMs to support MM inputs or outputs …
substantial advancements, augmenting off-the-shelf LLMs to support MM inputs or outputs …
Survey on factuality in large language models: Knowledge, retrieval and domain-specificity
This survey addresses the crucial issue of factuality in Large Language Models (LLMs). As
LLMs find applications across diverse domains, the reliability and accuracy of their outputs …
LLMs find applications across diverse domains, the reliability and accuracy of their outputs …
Mmbench: Is your multi-modal model an all-around player?
Large vision-language models (VLMs) have recently achieved remarkable progress,
exhibiting impressive multimodal perception and reasoning abilities. However, effectively …
exhibiting impressive multimodal perception and reasoning abilities. However, effectively …
How far are we to gpt-4v? closing the gap to commercial multimodal models with open-source suites
In this paper, we introduce InternVL 1.5, an open-source multimodal large language model
(MLLM) to bridge the capability gap between open-source and proprietary commercial …
(MLLM) to bridge the capability gap between open-source and proprietary commercial …
Internlm-xcomposer2-4khd: A pioneering large vision-language model handling resolutions from 336 pixels to 4k hd
Abstract The Large Vision-Language Model (LVLM) field has seen significant
advancements, yet its progression has been hindered by challenges in comprehending fine …
advancements, yet its progression has been hindered by challenges in comprehending fine …
Llamafactory: Unified efficient fine-tuning of 100+ language models
Efficient fine-tuning is vital for adapting large language models (LLMs) to downstream tasks.
However, it requires non-trivial efforts to implement these methods on different models. We …
However, it requires non-trivial efforts to implement these methods on different models. We …
Chatlaw: Open-source legal large language model with integrated external knowledge bases
AI legal assistants based on Large Language Models (LLMs) can provide accessible legal
consulting services, but the hallucination problem poses potential legal risks. This paper …
consulting services, but the hallucination problem poses potential legal risks. This paper …
Pointllm: Empowering large language models to understand point clouds
The unprecedented advancements in Large Language Models (LLMs) have shown a
profound impact on natural language processing but are yet to fully embrace the realm of 3D …
profound impact on natural language processing but are yet to fully embrace the realm of 3D …
Internlm-xcomposer: A vision-language large model for advanced text-image comprehension and composition
We propose InternLM-XComposer, a vision-language large model that enables advanced
image-text comprehension and composition. The innovative nature of our model is …
image-text comprehension and composition. The innovative nature of our model is …
Blink: Multimodal large language models can see but not perceive
We introduce Blink, a new benchmark for multimodal language models (LLMs) that focuses
on core visual perception abilities not found in other evaluations. Most of the Blink tasks can …
on core visual perception abilities not found in other evaluations. Most of the Blink tasks can …