Learning to predict activity progress by self-supervised video alignment

G Donahue, E Elhamifar - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

In this paper we tackle the problem of self-supervised video alignment and activity progress
prediction using in-the-wild videos. Our proposed self-supervised representation learning …

Cited by 6

A novel model for fall detection and action recognition combined lightweight 3D-CNN and convolutional LSTM networks

C Su, J Wei, D Lin, L Kong, YL Guan - Pattern Analysis and Applications, 2024 - Springer

Three-dimensional convolutional neural networks (3D-CNNs) and full connection long short-
term memory networks (FC-LSTMs) have been demonstrated as a kind of powerful non …

Cited by 5

Unsupervised prototype self-calibration based on hybrid attention contrastive learning for enhanced few-shot action recognition

Y An, Y Yi, L Wu, Y Cao, D Zhou, Y Yuan, B Liu… - Applied Soft …, 2025 - Elsevier

The collection and annotation of large-scale video data pose significant challenges,
prompting the exploration of few-shot models to recognize unseen actions with limited …

Self-Supervised 3-D Action Recognition by Contrasting Context-Enhanced Action Embeddings

K Ye, BN Zhao, S Liang, H Yao… - IEEE Transactions on …, 2025 - ieeexplore.ieee.org

3-D action recognition has become a fast-paced field in recent years. However, traditional
approaches have limitations. They either focus on modeling overly detailed yet redundant …

Self-Supervised Representation Learning with Spatial-Temporal Consistency for Sign Language Recognition

W Zhao, W Zhou, H Hu, M Wang, H Li - arXiv preprint arXiv:2406.10501, 2024 - arxiv.org

Recently, there have been efforts to improve the performance in sign language recognition
by designing self-supervised learning methods. However, these methods capture limited …

Cited by 1

Spatiotemporal feature enhancement network for action recognition

G Huang, X Wang, X Li, Y Wang - Multimedia Tools and Applications, 2024 - Springer

As a hot topic in the field of computer vision, video action recognition has great application
potential, such as intelligent monitoring, data recommendation and virtual reality. However …

Unifying Video Self-Supervised Learning across Families of Tasks: A Survey

I Dave, M Gunawardhana, L Sadith, H Zhou, L David… - Preprints, 2024 - preprints.org

Video self-supervised learning (VideoSSL) offers significant potential for reducing
annotation costs and enhancing a wide range of downstream tasks in video understanding …

CFI-Former: Cross-Feature Interaction Transformer for Group Activity Recognition

X Zhu, Y Zhou - Yan, CFI-Former: Cross-Feature Interaction …, 2023 - papers.ssrn.com

Group activity recognition (GAR) is a significant and challenging task that has attracted
considerable attention in video analysis. However, most of the existing models directly …

Unsupervised Prototype Self-Calibration Spatio-Temporal Attention Network for Enhanced Few-Shot Action Recognition

Y Yi, L Wu, Y Yuan, B Liu, Y Li, CY Su - papers.ssrn.com

The collection and annotation of large-scale video data pose significant challenges,
prompting the exploration of few-shot learning models capable of recognizing unseen …

Self-supervised video-based action recognition with disturbances

Learning to predict activity progress by self-supervised video alignment