Vision transformers for action recognition: A survey

A Ulhaq, N Akhtar, G Pogrebna, A Mian - arXiv preprint arXiv:2209.05700, 2022 - arxiv.org
Vision transformers are emerging as a powerful tool to solve computer vision problems.
Recent techniques have also proven the efficacy of transformers beyond the image domain …

Deep learning-based action detection in untrimmed videos: A survey

E Vahdani, Y Tian - IEEE Transactions on Pattern Analysis and …, 2022 - ieeexplore.ieee.org
Understanding human behavior and activity facilitates advancement of numerous real-world
applications, and is critical for video analysis. Despite the progress of action recognition …

STAR-Transformer: A spatio-temporal cross attention transformer for human action recognition

D Ahn, S Kim, H Hong, BC Ko - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
In action recognition, although the combination of spatio-temporal videos and skeleton
features can improve the recognition performance, a separate model and balancing feature …

VideoLLM: Modeling video sequence with large language models

G Chen, YD Zheng, J Wang, J Xu, Y Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
With the exponential growth of video data, there is an urgent need for automated technology
to analyze and comprehend video content. However, existing video understanding models …

Video transformers: A survey

J Selva, AS Johansen, S Escalera… - … on Pattern Analysis …, 2023 - ieeexplore.ieee.org
Transformer models have shown great success handling long-range interactions, making
them a promising tool for modeling video. However, they lack inductive biases and scale …

Hybrid relation guided set matching for few-shot action recognition

X Wang, S Zhang, Z Qing, M Tang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Current few-shot action recognition methods reach impressive performance by learning
discriminative features for each video via episodic training and designing various temporal …

MoLo: Motion-augmented long-short contrastive learning for few-shot action recognition

X Wang, S Zhang, Z Qing, C Gao… - Proceedings of the …, 2023 - openaccess.thecvf.com
Current state-of-the-art approaches for few-shot action recognition achieve promising
performance by conducting frame-level matching on learned visual features. However, they …

Flow-guided transformer for video inpainting

K Zhang, J Fu, D Liu - European Conference on Computer Vision, 2022 - Springer
We propose a flow-guided transformer, which innovatively leverages the motion discrepancy
exposed by optical flows to instruct the attention retrieval in transformer for high fidelity video …

Real-time online video detection with temporal smoothing transformers

Y Zhao, P Krähenbühl - European Conference on Computer Vision, 2022 - Springer
Streaming video recognition reasons about objects and their actions in every frame of a
video. A good streaming recognition model captures both long-term dynamics and short …

Progress-aware online action segmentation for egocentric procedural task videos

Y Shen, E Elhamifar - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com
We address the problem of online action segmentation for egocentric procedural task
videos. While previous studies have mostly focused on offline action segmentation where …