MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs
As a prominent direction of Artificial General Intelligence (AGI), Multimodal Large Language
Models (MLLMs) have garnered increased attention from both industry and academia …
LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via a Hybrid Architecture
Expanding the long-context capabilities of Multi-modal Large Language Models (MLLMs) is
crucial for video understanding, high-resolution image understanding, and multi-modal …
Video-XL: Extra-Long Vision Language Model for Hour-Scale Video Understanding
Although current Multi-modal Large Language Models (MLLMs) demonstrate promising
results in video understanding, processing extremely long videos remains an ongoing …
A Survey on Multimodal Benchmarks: In the Era of Large AI Models
The rapid evolution of Multimodal Large Language Models (MLLMs) has brought substantial
advancements in artificial intelligence, significantly enhancing the capability to understand …
Visual Haystacks: A Vision-Centric Needle-In-A-Haystack Benchmark
Large Multimodal Models (LMMs) have made significant strides in visual question-
answering for single images. Recent advancements like long-context LMMs have allowed …
VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling
Long-context modeling is a critical capability for multimodal large language models
(MLLMs), enabling them to process long-form contents with implicit memorization. Despite …
InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling
This paper aims to improve the performance of video multimodal large language models
(MLLM) via long and rich context (LRC) modeling. As a result, we develop a new version of …
VCBench: A Controllable Benchmark for Symbolic and Abstract Challenges in Video Cognition
Recent advancements in Large Video-Language Models (LVLMs) have driven the
development of benchmarks designed to assess cognitive abilities in video-based tasks …
VRoPE: Rotary Position Embedding for Video Large Language Models
Rotary Position Embedding (RoPE) has shown strong performance in text-based Large
Language Models (LLMs), but extending it to video remains a challenge due to the intricate …