AdaShield: Safeguarding multimodal large language models from structure-based attack via adaptive shield prompting
With the advent and widespread deployment of Multimodal Large Language Models
(MLLMs), the imperative to ensure their safety has become increasingly pronounced …
Video understanding with large language models: A survey
With the burgeoning growth of online video platforms and the escalating volume of video
content, the demand for proficient video understanding tools has intensified markedly. Given …
Gemini Pro defeated by GPT-4V: Evidence from education
This study compared the classification performance of Gemini Pro and GPT-4V in
educational settings. Employing visual question answering (VQA) techniques, the study …
GPT4Vis: what can GPT-4 do for zero-shot visual recognition?
This paper does not present a novel method. Instead, it delves into an essential, yet must-
know baseline in light of the latest advancements in Generative Artificial Intelligence …
VideoVista: A versatile benchmark for video understanding and reasoning
Despite significant breakthroughs in video analysis driven by the rapid development of large
multimodal models (LMMs), there remains a lack of a versatile evaluation benchmark to …
CoCoT: Contrastive chain-of-thought prompting for large multimodal models with multiple image inputs
When exploring the development of Artificial General Intelligence (AGI), a critical task for
these models involves interpreting and processing information from multiple image inputs …
Visual-RolePlay: Universal jailbreak attack on multimodal large language models via role-playing image character
With the advent and widespread deployment of Multimodal Large Language Models
(MLLMs), ensuring their safety has become increasingly critical. To achieve this objective, it …
FakingRecipe: Detecting fake news on short video platforms from the perspective of creative process
As short-form video-sharing platforms become a significant channel for news consumption,
fake news in short videos has emerged as a serious threat in the online information …
GPT4Ego: unleashing the potential of pre-trained models for zero-shot egocentric action recognition
Vision-Language Models (VLMs), pre-trained on large-scale datasets, have shown
impressive performance in various visual recognition tasks. This advancement paves the …
Machine-generated text localization
Machine-Generated Text (MGT) detection aims to identify a piece of text as machine or
human written. Prior work has primarily formulated MGT detection as a binary classification …