Pivotal: Prior-driven supervision for weakly-supervised temporal action localization
Weakly-supervised Temporal Action Localization (WTAL) attempts to localize the
actions in untrimmed videos using only video-level supervision. Most recent works approach …
Spact: Self-supervised privacy preservation for action recognition
Visual private information leakage is an emerging key issue for the fast-growing applications
of video understanding like activity recognition. Existing approaches for mitigating privacy …
Online Action Detection in Surveillance Scenarios: A Comprehensive Review and Comparative Study of State-of-the-Art Multi-Object Tracking Methods
J Alikhanov, H Kim - IEEE Access, 2023 - ieeexplore.ieee.org
Online action detection in surveillance scenarios presents considerable challenges,
particularly due to the dynamically changing environments and real-time processing …
A survey on deep learning-based spatio-temporal action detection
Spatio-temporal action detection (STAD) aims to classify the actions present in a video and
localize them in space and time. It has become a particularly active area of research in …
Timebalance: Temporally-invariant and temporally-distinctive video representations for semi-supervised action recognition
Semi-Supervised Learning can be more beneficial for the video domain compared
to images because of its higher annotation cost and dimensionality. Besides, any video …
On occlusions in video action detection: benchmark datasets and training recipes
This paper explores the impact of occlusions in video action detection. We facilitate this study
by introducing five new benchmark datasets namely O-UCF and O-JHMDB consisting of …
Audio-visual glance network for efficient video recognition
Deep learning has made significant strides in video understanding tasks, but the
computation required to classify lengthy and massive videos using clip-level video …
Transvisdrone: Spatio-temporal transformer for vision-based drone-to-drone detection in aerial videos
Drone-to-drone detection using visual feed has crucial applications, such as detecting drone
collisions, detecting drone attacks, or coordinating flight with other drones. However, existing …
Spatio-temporal action detection under large motion
Current methods for spatiotemporal action tube detection often extend a bounding box
proposal at a given key-frame into a 3D temporal cuboid and pool features from nearby …
Sync from the sea: retrieving alignable videos from large-scale datasets
Temporal video alignment aims to synchronize the key events like object interactions or
action phase transitions in two videos. Such methods could benefit various video editing …