Image-text retrieval: A survey on recent research and development
In the past few years, cross-modal image-text retrieval (ITR) has experienced increased
interest in the research community due to its excellent research value and broad real-world …
Cross-modal retrieval: a systematic review of methods and future directions
With the exponential surge in diverse multimodal data, traditional unimodal retrieval
methods struggle to meet the needs of users seeking access to data across various …
FashionVLP: Vision language transformer for fashion retrieval with feedback
Fashion image retrieval based on a query pair of reference image and natural language
feedback is a challenging task that requires models to assess fashion related information …
ViSTA: Vision and scene text aggregation for cross-modal retrieval
Visual appearance is considered to be the most important cue to understand images for
cross-modal retrieval, while sometimes the scene text appearing in images can provide …
A large cross-modal video retrieval dataset with reading comprehension
Most existing cross-modal language-to-video retrieval (VR) research focuses on single-
modal input from video, i.e., visual representation, while the text is omnipresent in human …
MMpedia: A large-scale multi-modal knowledge graph
Knowledge graphs serve as crucial resources for various applications. However,
most existing knowledge graphs present symbolic knowledge in the form of natural …
OCR-IDL: OCR annotations for industry document library dataset
Pretraining has proven successful in Document Intelligence tasks where deluge of
documents are used to pretrain the models only later to be finetuned on downstream tasks …
Is an image worth five sentences? A new look into semantics for image-text matching
The task of image-text matching aims to map representations from different modalities into a
common joint visual-textual embedding. However, the most widely used datasets for this …
BCRA: bidirectional cross-modal implicit relation reasoning and aligning for text-to-image person retrieval
Z Li, Y Xie - Multimedia Systems, 2024 - Springer
Text-to-image person retrieval aims to retrieve relevant target individuals based on given
textual descriptions. The main challenge faced by this task is how to better combine and …
Adaptive transformer-based conditioned variational autoencoder for incomplete social event classification
With the rapid development of the Internet and the expanding scale of social media,
incomplete social event classification has increasingly become a challenging task. The key …