Deep learning-based action detection in untrimmed videos: A survey

E Vahdani, Y Tian - IEEE Transactions on Pattern Analysis and …, 2022 - ieeexplore.ieee.org
Understanding human behavior and activity facilitates advancement of numerous real-world
applications, and is critical for video analysis. Despite the progress of action recognition …

Actionformer: Localizing moments of actions with transformers

CL Zhang, J Wu, Y Li - European Conference on Computer Vision, 2022 - Springer
Self-attention based Transformer models have demonstrated impressive results for image
classification and object detection, and more recently for video understanding. Inspired by …

Prompting visual-language models for efficient video understanding

C Ju, T Han, K Zheng, Y Zhang, W **e - European Conference on …, 2022 - Springer
Image-based visual-language (I-VL) pre-training has shown great success for learning joint
visual-textual representations from large-scale web data, revealing remarkable ability for …

Bmn: Boundary-matching network for temporal action proposal generation

T Lin, X Liu, X Li, E Ding, S Wen - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
Temporal action proposal generation is an challenging and promising task which aims to
locate temporal regions in real-world videos where action or event may occur. Current …

G-tad: Sub-graph localization for temporal action detection

M Xu, C Zhao, DS Rojas, A Thabet… - Proceedings of the …, 2020 - openaccess.thecvf.com
Temporal action detection is a fundamental yet challenging task in video understanding.
Video context is a critical cue to effectively detect actions, but current works mainly focus on …

Graph convolutional networks for temporal action localization

R Zeng, W Huang, M Tan, Y Rong… - Proceedings of the …, 2019 - openaccess.thecvf.com
Most state-of-the-art action localization systems process each action proposal individually,
without explicitly exploiting their relations during learning. However, the relations between …

Unloc: A unified framework for video localization tasks

S Yan, X **ong, A Nagrani, A Arnab… - Proceedings of the …, 2023 - openaccess.thecvf.com
While large-scale image-text pretrained models such as CLIP have been used for multiple
video-level tasks on trimmed videos, their use for temporal localization in untrimmed videos …

Relaxed transformer decoders for direct action proposal generation

J Tan, J Tang, L Wang, G Wu - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Temporal action proposal generation is an important and challenging task in video
understanding, which aims at detecting all temporal segments containing action instances of …

Bsn: Boundary sensitive network for temporal action proposal generation

T Lin, X Zhao, H Su, C Wang… - Proceedings of the …, 2018 - openaccess.thecvf.com
Temporal action proposal generation is an important yet challenging problem, since
temporal proposals with rich action content are indispensable for analysing real-world …

Rethinking the faster r-cnn architecture for temporal action localization

YW Chao, S Vijayanarasimhan… - Proceedings of the …, 2018 - openaccess.thecvf.com
We propose TAL-Net, an improved approach to temporal action localization in video that is
inspired by the Faster R-CNN object detection framework. TAL-Net addresses three key …