E5-v: Universal embeddings with multimodal large language models
Multimodal large language models (MLLMs) have shown promising advancements in
general visual and language understanding. However, the representation of multimodal …
Learning commonality, divergence and variety for unsupervised visible-infrared person re-identification
Unsupervised visible-infrared person re-identification (USVI-ReID) aims to match specified
people in infrared images to visible images without annotations, and vice versa. USVI-ReID …
Progressive multimodal reasoning via active retrieval
Multi-step multimodal reasoning tasks pose significant challenges for multimodal large
language models (MLLMs), and finding effective ways to enhance their performance in such …
When Text Embedding Meets Large Language Model: A Comprehensive Survey
Text embedding has become a foundational technology in natural language processing
(NLP) during the deep learning era, driving advancements across a wide array of …
InfiR: Crafting Effective Small Language Models and Multimodal Small Language Models in Reasoning
C **e, S Cai, W Wang, P Li, Z Sang, K Yang… - arXiv preprint arXiv …, 2025 - arxiv.org
Large Language Models (LLMs) and Multimodal Large Language Models (MLLMs) have
made significant advancements in reasoning capabilities. However, they still face …
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval
Despite the rapidly growing demand for multimodal retrieval, progress in this field remains
severely constrained by a lack of training data. In this paper, we introduce MegaPairs, a …
O1 Embedder: Let Retrievers Think Before Action
The growing power of large language models (LLMs) has revolutionized how people access
and utilize information. Notably, the LLMs excel at performing fine-grained data …
Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval
With the popularity of multimodal techniques, it receives growing interests to acquire useful
information in visual forms. In this work, we formally define an emerging IR paradigm …
GME: Improving Universal Multimodal Retrieval by Multimodal LLMs
Universal Multimodal Retrieval (UMR) aims to enable search across various modalities
using a unified model, where queries and candidates can consist of pure text, images, or a …
Fine-grained Video-Text Retrieval: A New Benchmark and Method
The ability of perceiving fine-grained spatial and temporal information is crucial for video-
language retrieval. However, the existing video retrieval benchmarks, such as MSRVTT and …