LLaVA-OneVision: Easy visual task transfer
We present LLaVA-OneVision, a family of open large multimodal models (LMMs) developed
by consolidating our insights into data, models, and visual representations in the LLaVA …
A survey on the use of large language models (LLMs) in fake news
The proliferation of fake news and fake profiles on social media platforms poses significant
threats to information integrity and societal trust. Traditional detection methods, including …
RULER: What's the Real Context Size of Your Long-Context Language Models?
The needle-in-a-haystack (NIAH) test, which examines the ability to retrieve a piece of
information (the "needle") from long distractor texts (the "haystack"), has been widely …
Qwen2.5-Coder technical report
In this report, we introduce the Qwen2.5-Coder series, a significant upgrade from its
predecessor, CodeQwen1.5. This series includes six models: Qwen2.5-Coder-(0.5B/1.5 …
Video instruction tuning with synthetic data
The development of video large multimodal models (LMMs) has been hindered by the
difficulty of curating large amounts of high-quality raw data from the web. To address this, we …
Molmo and PixMo: Open weights and open data for state-of-the-art multimodal models
Today's most advanced multimodal models remain proprietary. The strongest open-weight
models rely heavily on synthetic data from proprietary VLMs to achieve good performance …
Qwen2.5-Math technical report: Toward mathematical expert model via self-improvement
In this report, we present a series of math-specific large language models: Qwen2.5-Math
and Qwen2.5-Math-Instruct-1.5B/7B/72B. The core innovation of the Qwen2.5 series lies in …
mPLUG-Owl3: Towards long image-sequence understanding in multi-modal large language models
Multi-modal Large Language Models (MLLMs) have demonstrated remarkable capabilities
in executing instructions for a variety of single-image tasks. Despite this progress, significant …
Large language model inference acceleration: A comprehensive hardware perspective
Large Language Models (LLMs) have demonstrated remarkable capabilities across various
fields, from natural language understanding to text generation. Compared to non-generative …
Graph retrieval-augmented generation: A survey
Recently, Retrieval-Augmented Generation (RAG) has achieved remarkable success in
addressing the challenges of Large Language Models (LLMs) without necessitating …