Capsule networks for computer vision applications: a comprehensive review

S Choudhary, S Saurav, R Saini, S Singh - Applied Intelligence, 2023 - Springer
Convolutional neural networks (CNNs) have achieved human-level performance in various
computer vision tasks, such as image classification, object detection & segmentation, etc …

VideoCapsuleNet: A simplified network for action detection

K Duarte, Y Rawat, M Shah - Advances in neural …, 2018 - proceedings.neurips.cc
The recent advances in Deep Convolutional Neural Networks (DCNNs) have shown
extremely good results for video human action classification, however, action detection is …

Deep learning-based hierarchical cattle behavior recognition with spatio-temporal information

A Fuentes, S Yoon, J Park, DS Park - Computers and Electronics in …, 2020 - Elsevier
Behavior is an important indicator for understanding the well-being of animals. This process
has been frequently carried out by observing video records to detect changes with statistical …

Recurrent tubelet proposal and recognition networks for action detection

D Li, Z Qiu, Q Dai, T Yao, T Mei - Proceedings of the …, 2018 - openaccess.thecvf.com
Detecting actions in videos is a challenging task as video is an information intensive media
with complex variations. Existing approaches predominantly generate action proposals for …

Dance with flow: Two-in-one stream action detection

J Zhao, CGM Snoek - … of the IEEE/CVF conference on …, 2019 - openaccess.thecvf.com
The goal of this paper is to detect the spatio-temporal extent of an action. The two-stream
detection network based on RGB and flow provides state-of-the-art accuracy at the expense …

A survey on deep learning-based spatio-temporal action detection

P Wang, F Zeng, Y Qian - International Journal of Wavelets …, 2024 - World Scientific
Spatio-temporal action detection (STAD) aims to classify the actions present in a video and
localize them in space and time. It has become a particularly active area of research in …

Learning motion representation for real-time spatio-temporal action localization

D Zhang, L He, Z Tu, S Zhang, F Han, B Yang - Pattern Recognition, 2020 - Elsevier
The current deep learning-based spatio-temporal action localization methods that use
motion information (predominantly optical flow) obtain state-of-the-art performance …

Hierarchical self-attention network for action localization in videos

RRA Pramono, YT Chen… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
This paper presents a novel Hierarchical Self-Attention Network (HISAN) to generate spatial-
temporal tubes for action localization in videos. The essence of HISAN is to combine the two …

P3D-CTN: Pseudo-3D convolutional tube network for spatio-temporal action detection in videos

J Wei, H Wang, Y Yi, Q Li… - 2019 IEEE international …, 2019 - ieeexplore.ieee.org
The spatial independence and temporal continuity of video data as a whole are not fully
investigated for video action detection. To tackle this issue, a deep network architecture is …

Guess where? Actor-supervision for spatiotemporal action localization

V Escorcia, CD Dao, M Jain, B Ghanem… - Computer Vision and …, 2020 - Elsevier
This paper addresses the problem of spatiotemporal localization of actions in videos.
Compared to leading approaches, which all learn to localize based on carefully annotated …