Going deeper into action recognition: A survey

S Herath, M Harandi, F Porikli - Image and vision computing, 2017 - Elsevier
Understanding human actions in visual data is tied to advances in complementary research
areas including object recognition, human dynamics, domain adaptation and semantic …

ActionVLAD: Learning spatio-temporal aggregation for action classification

R Girdhar, D Ramanan, A Gupta… - Proceedings of the …, 2017 - openaccess.thecvf.com
In this work, we introduce a new video representation for action classification that
aggregates local convolutional features across the entire spatio-temporal extent of the video …

VideoLSTM convolves, attends and flows for action recognition

Z Li, K Gavrilyuk, E Gavves, M Jain… - Computer Vision and …, 2018 - Elsevier
We present VideoLSTM for end-to-end sequence learning of actions in video. Rather than
adapting the video to the peculiarities of established recurrent or convolutional architectures …

Deep learning innovations in video classification: A survey on techniques and dataset evaluations

M Mao, A Lee, M Hong - Electronics, 2024 - mdpi.com
Video classification has achieved remarkable success in recent years, driven by advanced
deep learning models that automatically categorize video content. This paper provides a …

Learnable pooling with context gating for video classification

A Miech, I Laptev, J Sivic - arXiv preprint arXiv:1706.06905, 2017 - arxiv.org
Current methods for video analysis often extract frame-level features using pre-trained
convolutional neural networks (CNNs). Such features are then aggregated over time, e.g., by …

Action recognition with dynamic image networks

H Bilen, B Fernando, E Gavves… - IEEE transactions on …, 2017 - ieeexplore.ieee.org
We introduce the concept of dynamic image, a novel compact representation of videos
useful for video analysis, particularly in combination with convolutional neural networks …

Chained multi-stream networks exploiting pose, motion, and appearance for action classification and detection

M Zolfaghari, GL Oliveira… - Proceedings of the …, 2017 - openaccess.thecvf.com
General human action recognition requires understanding of various visual cues. In this
paper, we propose a network architecture that computes and integrates the most important …

Asynchronous temporal fields for action recognition

GA Sigurdsson, S Divvala… - Proceedings of the …, 2017 - openaccess.thecvf.com
Actions are more than just movements and trajectories: we cook to eat and we hold a cup to
drink from it. A thorough understanding of videos requires going beyond appearance …

Procedural generation of videos to train deep action recognition networks

C Roberto de Souza, A Gaidon… - Proceedings of the …, 2017 - openaccess.thecvf.com
Deep learning for human action recognition in videos is making significant progress, but is
slowed down by its dependency on expensive manual labeling of large video collections. In …

Deep unsupervised key frame extraction for efficient video classification

H Tang, L Ding, S Wu, B Ren, N Sebe… - ACM Transactions on …, 2023 - dl.acm.org
Video processing and analysis have become an urgent task, as a huge amount of videos
(e.g., YouTube, Hulu) are uploaded online every day. The extraction of representative key …