Open-vocabulary video anomaly detection
Current video anomaly detection (VAD) approaches with weak supervisions are inherently
limited to a closed-set setting and may struggle in open-world applications where there can …
Audio-visual segmentation via unlabeled frame exploitation
Audio-visual segmentation (AVS) aims to segment the sounding objects in video frames.
Although great progress has been witnessed, we experimentally reveal that current methods …
AttrSeg: open-vocabulary semantic segmentation via attribute decomposition-aggregation
Open-vocabulary semantic segmentation is a challenging task that requires segmenting
novel object categories at inference time. Recent works explore vision-language pre-training …
Distilling vision-language pre-training to collaborate with weakly-supervised temporal action localization
Weakly-supervised temporal action localization (WTAL) learns to detect and classify action
instances with only category labels. Most methods widely adopt the off-the-shelf …
Turbo: Informativity-driven acceleration plug-in for vision-language large models
Vision-Language Large Models (VLMs) recently become primary backbone of AI,
due to the impressive performance. However, their expensive computation costs, i.e. …
Zero-shot temporal action detection by learning multimodal prompts and text-enhanced actionness
Zero-shot temporal action detection (ZS-TAD), aiming to recognize and detect new and
unseen video actions, is an emerging and challenging task with limited solutions. Recent …
Denoiser: Rethinking the robustness for open-vocabulary action recognition
As one of the fundamental video tasks in computer vision, Open-Vocabulary Action
Recognition (OVAR) recently gains increasing attention, with the development of vision …
Turbo: informativity-driven acceleration plug-in for vision-language models
Vision-Language Large Models (VLMs) have become primary backbone of AI, due to the
impressive performance. However, their expensive computation costs, i.e., throughput and …
Advancing Myopia To Holism: Fully Contrastive Language-Image Pre-training
In rapidly evolving field of vision-language models (VLMs), contrastive language-image pre-
training (CLIP) has made significant strides, becoming foundation for various downstream …
Com-STAL: Compositional spatio-temporal action localization
Spatio-temporal action localization aims to locate the spatial and temporal positions of
actors and classify their actions. However, prior research overlooks the fact that human …