TN-ZSTAD: Transferable network for zero-shot temporal activity detection

L Zhang, X Chang, J Liu, M Luo, Z Li… - … on Pattern Analysis …, 2022 - ieeexplore.ieee.org
An integral part of video analysis and surveillance is temporal activity detection, which
means to simultaneously recognize and localize activities in long untrimmed videos …

In the eye of beholder: Joint learning of gaze and actions in first person video

Y Li, M Liu, JM Rehg - Proceedings of the European …, 2018 - openaccess.thecvf.com
We address the task of jointly determining what a person is doing and where they are
looking based on the analysis of video captured by a headworn camera. We propose a …

RT-GENE: Real-time eye gaze estimation in natural environments

T Fischer, HJ Chang, Y Demiris - Proceedings of the …, 2018 - openaccess.thecvf.com
In this work, we consider the problem of robust gaze estimation in natural environments.
Large camera-to-subject distances and high variations in head pose and eye gaze angles …

Every moment counts: Dense detailed labeling of actions in complex videos

S Yeung, O Russakovsky, N Jin, M Andriluka… - International Journal of …, 2018 - Springer
Every moment counts in action recognition. A comprehensive understanding of human
activity in video requires labeling every frame according to the actions occurring, placing …

Class semantics-based attention for action detection

D Sridhar, N Quader, S Muralidharan… - Proceedings of the …, 2021 - openaccess.thecvf.com
Action localization networks are often structured as a feature encoder sub-network and a
localization sub-network, where the feature encoder learns to transform an input video to …

Action recognition in realistic sports videos

K Soomro, AR Zamir - Computer vision in sports, 2015 - Springer
The ability to analyze the actions which occur in a video is essential for automatic
understanding of sports. Action localization and recognition in videos are two main research …

Detecting events and key actors in multi-person videos

V Ramanathan, J Huang, S Abu-El-Haija… - Proceedings of the …, 2016 - cv-foundation.org
Multi-person event recognition is a challenging task, often with many people active in the
scene but only a small subset contributing to an actual event. In this paper, we propose a …

Reinforcement learning for visual object detection

S Mathe, A Pirinen… - Proceedings of the IEEE …, 2016 - openaccess.thecvf.com
One of the most widely used strategies for visual object detection is based on exhaustive
spatial hypothesis search. While methods like sliding windows have been successful and …

Fast action proposals for human action detection and search

G Yu, J Yuan - Proceedings of the IEEE conference on …, 2015 - openaccess.thecvf.com
In this paper we target at generating generic action proposals in unconstrained videos. Each
action proposal corresponds to a temporal series of spatial bounding boxes, i.e., a spatio …

Gaze-enabled egocentric video summarization via constrained submodular maximization

J Xu, L Mukherjee, Y Li, J Warner… - Proceedings of the …, 2015 - openaccess.thecvf.com
With the proliferation of wearable cameras, the number of videos of users documenting their
personal lives using such devices is rapidly increasing. Since such videos may span hours …