Vision transformers for action recognition: A survey

A Ulhaq, N Akhtar, G Pogrebna, A Mian - arXiv preprint arXiv:2209.05700, 2022 - arxiv.org
Vision transformers are emerging as a powerful tool to solve computer vision problems.
Recent techniques have also proven the efficacy of transformers beyond the image domain …

TriDet: Temporal action detection with relative boundary modeling

D Shi, Y Zhong, Q Cao, L Ma, J Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, we present a one-stage framework TriDet for temporal action detection.
Existing methods often suffer from imprecise boundary predictions due to the ambiguous …

Proposal-based multiple instance learning for weakly-supervised temporal action localization

H Ren, W Yang, T Zhang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Weakly-supervised temporal action localization aims to localize and recognize actions in
untrimmed videos with only video-level category labels during training. Without instance …

STMixer: A one-stage sparse action detector

T Wu, M Cao, Z Gao, G Wu… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Traditional video action detectors typically adopt the two-stage pipeline, where a person
detector is first employed to yield actor boxes and then 3D RoIAlign is used to extract actor …

DyFADet: Dynamic feature aggregation for temporal action detection

L Yang, Z Zheng, Y Han, H Cheng, S Song… - … on Computer Vision, 2024 - Springer
Recently proposed neural network-based Temporal Action Detection (TAD) models are
inherently limited to extracting the discriminative representations and modeling action …

Action detection via an image diffusion process

LG Foo, T Li, H Rahmani, J Liu - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Action detection aims to localize the starting and ending points of action instances in
untrimmed videos and predict the classes of those instances. In this paper we make the …

Movement enhancement toward multi-scale video feature representation for temporal action detection

Z Zhao, D Wang, X Zhao - Proceedings of the IEEE/CVF …, 2023 - openaccess.thecvf.com
Boundary localization is a challenging problem in Temporal Action Detection (TAD), in
which there are two main issues. First, the submergence of movement feature, i.e., the …

Adapting short-term transformers for action detection in untrimmed videos

M Yang, H Gao, P Guo, L Wang - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Vision Transformer (ViT) has shown high potential in video recognition owing to its
flexible design, adaptable self-attention mechanisms, and the efficacy of masked pre-training …

TransVFS: A spatio-temporal local–global transformer for vision-based force sensing during ultrasound-guided prostate biopsy

Y Wang, Z Ye, M Wen, H Liang, X Zhang - Medical image analysis, 2024 - Elsevier
Robot-assisted prostate biopsy is a new technology to diagnose prostate cancer, but its
safety is influenced by the inability of robots to sense the tool-tissue interaction force …

LGAFormer: transformer with local and global attention for action detection

H Zhang, F Zhou, D Wang, X Zhang, D Yu… - The Journal of …, 2024 - Springer
Temporal action detection is a very important task in video understanding, aiming at
predicting the start and end time boundaries of all action instances in an unedited video and …