Transfer learning and its extensive appositeness in human activity recognition: A survey
In this competitive world, the supervision and monitoring of human resources are primary
and necessary tasks to drive context-aware applications. Advancement in sensor and …
Expanding language-image pretrained models for general video recognition
Contrastive language-image pretraining has shown great success in learning visual-textual
joint representation from web-scale data, demonstrating remarkable “zero-shot” …
Fine-tuned CLIP models are efficient video learners
Large-scale multi-modal training with image-text pairs imparts strong generalization to CLIP
model. Since training on a similar scale for videos is infeasible, recent approaches focus on …
TN-ZSTAD: Transferable network for zero-shot temporal activity detection
An integral part of video analysis and surveillance is temporal activity detection, which
means to simultaneously recognize and localize activities in long untrimmed videos …
Vita-CLIP: Video and text adaptive CLIP via multimodal prompting
Adopting contrastive image-text pretrained models like CLIP towards video classification has
gained attention due to its cost-effectiveness and competitive performance. However, recent …
A comprehensive study of deep video action recognition
Video action recognition is one of the representative tasks for video understanding. Over the
last decade, we have witnessed great advancements in video action recognition thanks to …
Latent embedding feedback and discriminative features for zero-shot classification
Zero-shot learning strives to classify unseen categories for which no data is available during
training. In the generalized variant, the test samples can further belong to seen or unseen …
Deep learning innovations in video classification: A survey on techniques and dataset evaluations
Video classification has achieved remarkable success in recent years, driven by advanced
deep learning models that automatically categorize video content. This paper provides a …
Elaborative rehearsal for zero-shot action recognition
The growing number of action classes has posed a new challenge for video understanding,
making Zero-Shot Action Recognition (ZSAR) a thriving direction. The ZSAR task aims to …
Hidden two-stream convolutional networks for action recognition
Analyzing videos of human actions involves understanding the temporal relationships
among video frames. State-of-the-art action recognition approaches rely on traditional …