Expanding performance boundaries of open-source multimodal models with model, data, and test-time scaling
We introduce InternVL 2.5, an advanced multimodal large language model (MLLM) series
that builds upon InternVL 2.0, maintaining its core model architecture while introducing …
Scaling laws for precision
Low precision training and inference affect both the quality and cost of language models, but
current scaling laws do not account for this. In this work, we devise "precision-aware" scaling …
DeepSeek-VL2: Mixture-of-experts vision-language models for advanced multimodal understanding
We present DeepSeek-VL2, an advanced series of large Mixture-of-Experts (MoE) Vision-
Language Models that significantly improves upon its predecessor, DeepSeek-VL, through …
NaturalBench: Evaluating vision-language models on natural adversarial samples
Vision-language models (VLMs) have made significant progress in recent visual-question-
answering (VQA) benchmarks that evaluate complex visio-linguistic reasoning. However …
MEGA-Bench: Scaling multimodal evaluation to over 500 real-world tasks
We present MEGA-Bench, an evaluation suite that scales multimodal evaluation to over 500
real-world tasks, to address the highly heterogeneous daily use cases of end users. Our …
Leopard: A vision language model for text-rich multi-image tasks
Text-rich images, where text serves as the central visual element guiding the overall
understanding, are prevalent in real-world applications, such as presentation slides …
WorldCuisines: A massive-scale benchmark for multilingual and multicultural visual question answering on global cuisines
Vision Language Models (VLMs) often struggle with culture-specific knowledge, particularly
in languages other than English and in underrepresented cultural contexts. To evaluate their …
Can foundation models actively gather information in interactive environments to test hypotheses?
While problem solving is a standard evaluation task for foundation models, a crucial
component of problem solving--actively and strategically gathering information to test …
VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models
Vision-language generative reward models (VL-GenRMs) play a crucial role in aligning and
evaluating multimodal AI systems, yet their own evaluation remains under-explored. Current …
What is missing in multilingual visual reasoning and how to fix it
NLP models today strive for supporting multiple languages and modalities, improving
accessibility for diverse users. In this paper, we evaluate their multilingual, multimodal …