Going deeper into action recognition: A survey
Understanding human actions in visual data is tied to advances in complementary research
areas including object recognition, human dynamics, domain adaptation and semantic …
ActionVLAD: Learning spatio-temporal aggregation for action classification
In this work, we introduce a new video representation for action classification that
aggregates local convolutional features across the entire spatio-temporal extent of the video …
VideoLSTM convolves, attends and flows for action recognition
We present VideoLSTM for end-to-end sequence learning of actions in video. Rather than
adapting the video to the peculiarities of established recurrent or convolutional architectures …
Deep learning innovations in video classification: A survey on techniques and dataset evaluations
Video classification has achieved remarkable success in recent years, driven by advanced
deep learning models that automatically categorize video content. This paper provides a …
Learnable pooling with context gating for video classification
Current methods for video analysis often extract frame-level features using pre-trained
convolutional neural networks (CNNs). Such features are then aggregated over time, e.g., by …
Action recognition with dynamic image networks
We introduce the concept of dynamic image, a novel compact representation of videos
useful for video analysis, particularly in combination with convolutional neural networks …
Chained multi-stream networks exploiting pose, motion, and appearance for action classification and detection
General human action recognition requires understanding of various visual cues. In this
paper, we propose a network architecture that computes and integrates the most important …
Asynchronous temporal fields for action recognition
Actions are more than just movements and trajectories: we cook to eat and we hold a cup to
drink from it. A thorough understanding of videos requires going beyond appearance …
Procedural generation of videos to train deep action recognition networks
Deep learning for human action recognition in videos is making significant progress, but is
slowed down by its dependency on expensive manual labeling of large video collections. In …
Deep unsupervised key frame extraction for efficient video classification
Video processing and analysis have become urgent tasks, as a huge number of videos
(e.g., YouTube, Hulu) are uploaded online every day. The extraction of representative key …