A survey on deep learning-based spatio-temporal action detection

P Wang, F Zeng, Y Qian - International Journal of Wavelets …, 2024 - World Scientific
Spatio-temporal action detection (STAD) aims to classify the actions present in a video and
localize them in space and time. It has become a particularly active area of research in …

Multiscale vision transformers meet bipartite matching for efficient single-stage action localization

I Ntinou, E Sanchez… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Action Localization is a challenging problem that combines detection and recognition tasks
which are often addressed separately. State-of-the-art methods rely on off-the-shelf …

YOWOv2: A stronger yet efficient multi-level detection framework for real-time spatio-temporal action detection

Z Jiang, J Yang, N Jiang, S Liu, T Xie, L Zhao… - … Conference on Intelligent …, 2024 - Springer
Designing a real-time framework for the spatio-temporal action detection task is still a
challenge. In this paper, we propose a novel real-time action detection framework, YOWOv2 …

Classification Matters: Improving Video Action Detection with Class-Specific Attention

J Lee, T Kim, I Lee, M Shim, D Wee, M Cho… - European Conference on …, 2024 - Springer
Video action detection (VAD) aims to detect actors and classify their actions in a video. We
figure that VAD suffers more from classification rather than localization of actors. Hence, we …

TQRFormer: Tubelet query recollection transformer for action detection

X Wang, K Yang, Q Ding, R Wang, J Sun - Image and Vision Computing, 2024 - Elsevier
Spatial and temporal action detection aims to precisely locate actions while predicting their
respective categories. The existing solution, TubeR (Zhao et al., 2022), is designed to …

You watch once more: a more effective CNN architecture for video spatio-temporal action localization

Y Qin, L Chen, X Ben, M Yang - Multimedia Systems, 2024 - Springer
The task of spatio-temporal action localization (STAL) needs to detect the action and
position of individuals in the scene. Many works cannot model spatio-temporal information …

A Real-Time Subway Driver Action Sensoring and Detection Based on Lightweight ShuffleNetV2 Network

X Shen, X Wei - Sensors, 2023 - mdpi.com
The driving operations of the subway system are of great significance in ensuring the safety
of trains. There are several hand actions defined in the driving instructions that the driver …

AdaFPP: Adapt-Focused Bi-Propagating Prototype Learning for Panoramic Activity Recognition

M Cao, R Yan, X Shu, G Dai, Y Yao… - Proceedings of the 32nd …, 2024 - dl.acm.org
Panoramic Activity Recognition (PAR) aims to identify multi-granularity behaviors performed
by multiple persons in panoramic scenes, including individual activities, group activities, and …

Cefdet: Cognitive Effectiveness Network Based on Fuzzy Inference for Action Detection

Z Luo, W Fu, S Liu, S Anwar, M Saqib… - Proceedings of the …, 2024 - dl.acm.org
Action detection and understanding provide the foundation for the generation and
interaction of multimedia content. However, existing methods mainly focus on constructing …

Stable Mean Teacher for Semi-supervised Video Action Detection

A Kumar, S Mitra, YS Rawat - arXiv preprint arXiv:2412.07072, 2024 - arxiv.org
In this work, we focus on semi-supervised learning for video action detection. Video action
detection requires spatiotemporal localization in addition to classification, and a limited …