Combating misinformation in the age of LLMs: Opportunities and challenges
Misinformation such as fake news and rumors is a serious threat for information ecosystems
and public trust. The emergence of large language models (LLMs) has great potential to …
A Survey of Multimodel Large Language Models
Z Liang, Y Xu, Y Hong, P Shang, Q Wang… - Proceedings of the 3rd …, 2024 - dl.acm.org
With the widespread application of the Transformer architecture in various modalities,
including vision, the technology of large language models is evolving from a single modality …
A survey of large language models
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …
How far are we to GPT-4V? Closing the gap to commercial multimodal models with open-source suites
In this paper, we introduce InternVL 1.5, an open-source multimodal large language model
(MLLM) to bridge the capability gap between open-source and proprietary commercial …
MMMU: A massive multi-discipline multimodal understanding and reasoning benchmark for expert AGI
We introduce MMMU: a new benchmark designed to evaluate multimodal models on
massive multi-discipline tasks demanding college-level subject knowledge and deliberate …
A survey on hallucination in large language models: Principles, taxonomy, challenges, and open questions
The emergence of large language models (LLMs) has marked a significant breakthrough in
natural language processing (NLP), fueling a paradigm shift in information acquisition …
Eyes wide shut? Exploring the visual shortcomings of multimodal LLMs
Is vision good enough for language? Recent advancements in multimodal models primarily
stem from the powerful reasoning abilities of large language models (LLMs). However, the …
Fine-tuning large vision-language models as decision-making agents via reinforcement learning
Large vision-language models (VLMs) fine-tuned on specialized visual instruction-following
data have exhibited impressive language reasoning capabilities across various scenarios …
RLHF-V: Towards trustworthy MLLMs via behavior alignment from fine-grained correctional human feedback
Multimodal Large Language Models (MLLMs) have recently demonstrated
impressive capabilities in multimodal understanding, reasoning, and interaction. However …
Chat-UniVi: Unified visual representation empowers large language models with image and video understanding
Large language models have demonstrated impressive universal capabilities across a wide
range of open-ended tasks and have extended their utility to encompass multimodal …