LOOK-M: Look-once optimization in KV cache for efficient multimodal long-context inference
Long-context Multimodal Large Language Models (MLLMs) demand substantial
computational resources for inference as the growth of their multimodal Key-Value (KV) …
A survey on multimodal benchmarks: In the era of large AI models
The rapid evolution of Multimodal Large Language Models (MLLMs) has brought substantial
advancements in artificial intelligence, significantly enhancing the capability to understand …
VSP: Assessing the dual challenges of perception and reasoning in spatial planning tasks for VLMs
Vision language models (VLMs) are an exciting emerging class of language models (LMs)
that have merged classic LM capabilities with those of image processing systems. However …
Multilingual needle in a haystack: Investigating long-context behavior of multilingual large language models
While recent large language models (LLMs) demonstrate remarkable abilities in responding
to queries in diverse languages, their ability to handle long multilingual contexts is …
V2PE: Improving Multimodal Long-Context Capability of Vision-Language Models with Variable Visual Position Encoding
Vision-Language Models (VLMs) have shown promising capabilities in handling various
multimodal tasks, yet they struggle in long-context scenarios, particularly in tasks involving …
VISTA: Enhancing Long-Duration and High-Resolution Video Understanding by Video Spatiotemporal Augmentation
Current large multimodal models (LMMs) face significant challenges in processing and
comprehending long-duration or high-resolution videos, which is mainly due to the lack of …
MMLU-SR: A Benchmark for Stress-Testing Reasoning Capability of Large Language Models
W Wang, S Jain, P Kantor, J Feldman, L Gallos… - arXiv preprint arXiv …, 2024 - arxiv.org
We propose MMLU-SR, a novel dataset designed to measure the true comprehension
abilities of Large Language Models (LLMs) by challenging their performance in question …
MMMT-IF: A Challenging Multimodal Multi-Turn Instruction Following Benchmark
Evaluating instruction following capabilities for multimodal, multi-turn dialogue is
challenging. With potentially multiple instructions in the input model context, the task is time …
VisDoM: Multi-Document QA with Visually Rich Elements Using Multimodal Retrieval-Augmented Generation
Understanding information from a collection of multiple documents, particularly those with
visually rich elements, is important for document-grounded question answering. This paper …
Guided Code Generation with LLMs: A Multi-Agent Framework for Complex Code Tasks
A Almorsi, M Ahmed, W Gomaa - arXiv preprint arXiv:2501.06625, 2025 - arxiv.org
Large Language Models (LLMs) have shown remarkable capabilities in code generation
tasks, yet they face significant limitations in handling complex, long-context programming …