Semantic matters: A constrained approach for zero-shot video action recognition

Z Quan, J Chen, D Deguchi, J Sun, C Zhang, Y Li… - Pattern Recognition, 2025 - Elsevier
Zero-shot video action recognition has advanced significantly due to the adaptation of visual-
language models, such as CLIP, to video domains. However, existing methods attempt to …

Second-order transformer network for video recognition

B Zhang, W Dong, Z Wang, J Zhang, Q Sun - Alexandria Engineering …, 2025 - Elsevier
The video recognition community is undergoing a significant change in backbone shifting
from CNNs to transformers. However, due to the temporal information existing in the video …

Recognizing Video Activities in the Wild via View-to-Scene Joint Learning

J Yu, Y Chen, X Wang, X Cheng… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Recognizing video actions in the wild is challenging for visual control systems. In-the-wild
videos show actions not seen in training data, recorded from various angles and scenes with …

Domain-Separated Bottleneck Attention Fusion Framework for Multimodal Emotion Recognition

P He, J Yu, C Ge, Y Yu, W Xu, L Wang, T Liu… - ACM Transactions on …, 2024 - dl.acm.org
As a focal point of research in various fields, human body language understanding has long
been a subject of intense interest. Within this realm, the exploration of emotion recognition …

CHASE: Learning Convex Hull Adaptive Shift for Skeleton-based Multi-Entity Action Recognition

Y Wen, M Liu, S Wu, B Ding - arXiv preprint arXiv:2410.07153, 2024 - arxiv.org
Skeleton-based multi-entity action recognition is a challenging task aiming to identify
interactive actions or group activities involving multiple diverse entities. Existing models for …

ABNet: AI-Empowered Abnormal Action Recognition Method for Laboratory Mouse Behavior

Y Chen, C Guo, Y Han, S Hao, J Song - Bioengineering, 2024 - mdpi.com
The automatic recognition and quantitative analysis of abnormal behavior in mice play a
crucial role in behavioral observation experiments in neuroscience, pharmacology, and …

SEA: State-Exchange Attention for High-Fidelity Physics Based Transformers

P Esmati, A Dadashzadeh, V Goodarzi… - arXiv preprint arXiv …, 2024 - arxiv.org
Current approaches using sequential networks have shown promise in estimating field
variables for dynamical systems, but they are often limited by high rollout errors. The …

Football Penalty Kick Prediction Model Based on Kicker's Pose Estimation

JA Mauricio Salazar, H Alatrista-Salas - Proceedings of the 2024 9th …, 2024 - dl.acm.org
This paper describes an innovative methodology for predicting penalty kicks in football
based on the pose estimation of the kicker. Our proposal starts with the construction of a …

STA+: Spatiotemporal Adaptation with Adaptive Model Selection for Video Action Recognition

M Li, C Zhang, X Zheng - 2024 IEEE 4th International …, 2024 - ieeexplore.ieee.org
Recent breakthroughs in video models have achieved remarkable success by integrating
vision transformers into the video domain through adaptation. However, prevalent …

LanViKD: Cross-Modal Language-Vision Knowledge Distillation for Egocentric Action Recognition

Y Sun, H Li, CH Lin, R Batista-Navarro - 2024 - ceur-ws.org
Understanding human actions through the analysis of egocentric videos is a desirable
capability of intelligent agents, and is a research area that has gained popularity recently …