Transfer learning and its extensive appositeness in human activity recognition: A survey
A Ray, MH Kolekar - Expert Systems with Applications, 2024 - Elsevier
In this competitive world, the supervision and monitoring of human resources are primary
and necessary tasks to drive context-aware applications. Advancement in sensor and …
Expanding language-image pretrained models for general video recognition
Contrastive language-image pretraining has shown great success in learning visual-textual
joint representation from web-scale data, demonstrating remarkable “zero-shot” …
Fine-tuned clip models are efficient video learners
Large-scale multi-modal training with image-text pairs imparts strong generalization to CLIP
model. Since training on a similar scale for videos is infeasible, recent approaches focus on …
Prompting visual-language models for efficient video understanding
Image-based visual-language (I-VL) pre-training has shown great success for learning joint
visual-textual representations from large-scale web data, revealing remarkable ability for …
ActionCLIP: A new paradigm for video action recognition
The canonical approach to video action recognition dictates a neural model to do a classic
and standard 1-of-N majority vote task. They are trained to predict a fixed set of predefined …
Vita-CLIP: Video and text adaptive CLIP via multimodal prompting
Adopting contrastive image-text pretrained models like CLIP towards video classification has
gained attention due to its cost-effectiveness and competitive performance. However, recent …
EgoVLPv2: Egocentric video-language pre-training with fusion in the backbone
Video-language pre-training (VLP) has become increasingly important due to its ability to
generalize to various vision and language tasks. However, existing egocentric VLP …
Bidirectional cross-modal knowledge exploration for video recognition with pre-trained vision-language models
Vision-language models (VLMs) pre-trained on large-scale image-text pairs have
demonstrated impressive transferability on various visual tasks. Transferring knowledge …
Revisiting classifier: Transferring vision-language models for video recognition
Transferring knowledge from task-agnostic pre-trained deep models for downstream tasks is
an important topic in computer vision research. Along with the growth of computational …
OmniVid: A generative framework for universal video understanding
The core of video understanding tasks such as recognition, captioning, and tracking is to
automatically detect objects or actions in a video and analyze their temporal evolution …