Learning to predict activity progress by self-supervised video alignment

G Donahue, E Elhamifar - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
In this paper we tackle the problem of self-supervised video alignment and activity progress
prediction using in-the-wild videos. Our proposed self-supervised representation learning …

A novel model for fall detection and action recognition combined lightweight 3D-CNN and convolutional LSTM networks

C Su, J Wei, D Lin, L Kong, YL Guan - Pattern Analysis and Applications, 2024 - Springer
Three-dimensional convolutional neural networks (3D-CNNs) and fully connected long
short-term memory networks (FC-LSTMs) have been demonstrated as a kind of powerful non …

Unsupervised prototype self-calibration based on hybrid attention contrastive learning for enhanced few-shot action recognition

Y An, Y Yi, L Wu, Y Cao, D Zhou, Y Yuan, B Liu… - Applied Soft …, 2025 - Elsevier
The collection and annotation of large-scale video data pose significant challenges,
prompting the exploration of few-shot models to recognize unseen actions with limited …

Self-Supervised 3-D Action Recognition by Contrasting Context-Enhanced Action Embeddings

K Ye, BN Zhao, S Liang, H Yao… - IEEE Transactions on …, 2025 - ieeexplore.ieee.org
3-D action recognition has become a fast-paced field in recent years. However, traditional
approaches have limitations. They either focus on modeling overly detailed yet redundant …

Self-Supervised Representation Learning with Spatial-Temporal Consistency for Sign Language Recognition

W Zhao, W Zhou, H Hu, M Wang, H Li - arXiv preprint arXiv:2406.10501, 2024 - arxiv.org
Recently, there have been efforts to improve the performance in sign language recognition
by designing self-supervised learning methods. However, these methods capture limited …

Spatiotemporal feature enhancement network for action recognition

G Huang, X Wang, X Li, Y Wang - Multimedia Tools and Applications, 2024 - Springer
As a hot topic in the field of computer vision, video action recognition has great application
potential, such as intelligent monitoring, data recommendation and virtual reality. However …

Unifying Video Self-Supervised Learning across Families of Tasks: A Survey

I Dave, M Gunawardhana, L Sadith, H Zhou, L David… - Preprints, 2024 - preprints.org
Video self-supervised learning (VideoSSL) offers significant potential for reducing
annotation costs and enhancing a wide range of downstream tasks in video understanding …

CFI-Former: Cross-Feature Interaction Transformer for Group Activity Recognition

X Zhu, Y Zhou - Yan, CFI-Former: Cross-Feature Interaction …, 2023 - papers.ssrn.com
Group activity recognition (GAR) is a significant and challenging task that has attracted
considerable attention in video analysis. However, most of the existing models directly …

Unsupervised Prototype Self-Calibration Spatio-Temporal Attention Network for Enhanced Few-Shot Action Recognition

Y Yi, L Wu, Y Yuan, B Liu, Y Li, CY Su - papers.ssrn.com
The collection and annotation of large-scale video data pose significant challenges,
prompting the exploration of few-shot learning models capable of recognizing unseen …