A comprehensive overview of large language models
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …
natural language processing tasks and beyond. This success of LLMs has led to a large …
Parameter-efficient fine-tuning for large models: A comprehensive survey
Large models represent a groundbreaking advancement in multiple application fields,
enabling remarkable achievements across various tasks. However, their unprecedented …
enabling remarkable achievements across various tasks. However, their unprecedented …
Mmbench: Is your multi-modal model an all-around player?
Large vision-language models (VLMs) have recently achieved remarkable progress,
exhibiting impressive multimodal perception and reasoning abilities. However, effectively …
exhibiting impressive multimodal perception and reasoning abilities. However, effectively …
[PDF][PDF] Qwen-vl: A versatile vision-language model for understanding, localization, text reading, and beyond
In this work, we introduce the Qwen-VL series, a set of large-scale vision-language models
(LVLMs) designed to perceive and understand both texts and images. Starting from the …
(LVLMs) designed to perceive and understand both texts and images. Starting from the …
Video-llava: Learning united visual representation by alignment before projection
The Large Vision-Language Model (LVLM) has enhanced the performance of various
downstream tasks in visual-language understanding. Most existing approaches encode …
downstream tasks in visual-language understanding. Most existing approaches encode …
Are aligned neural networks adversarially aligned?
Large language models are now tuned to align with the goals of their creators, namely to be"
helpful and harmless." These models should respond helpfully to user questions, but refuse …
helpful and harmless." These models should respond helpfully to user questions, but refuse …
How far are we to gpt-4v? closing the gap to commercial multimodal models with open-source suites
In this paper, we introduce InternVL 1.5, an open-source multimodal large language model
(MLLM) to bridge the capability gap between open-source and proprietary commercial …
(MLLM) to bridge the capability gap between open-source and proprietary commercial …
Chameleon: Plug-and-play compositional reasoning with large language models
Large language models (LLMs) have achieved remarkable progress in solving various
natural language processing tasks due to emergent reasoning abilities. However, LLMs …
natural language processing tasks due to emergent reasoning abilities. However, LLMs …
Mmmu: A massive multi-discipline multimodal understanding and reasoning benchmark for expert agi
We introduce MMMU: a new benchmark designed to evaluate multimodal models on
massive multi-discipline tasks demanding college-level subject knowledge and deliberate …
massive multi-discipline tasks demanding college-level subject knowledge and deliberate …
Qwen-vl: A frontier large vision-language model with versatile abilities
In this work, we introduce the Qwen-VL series, a set of large-scale vision-language models
(LVLMs) designed to perceive and understand both texts and images. Starting from the …
(LVLMs) designed to perceive and understand both texts and images. Starting from the …