Gaussian temporal awareness networks for action localization

F Long, T Yao, Z Qiu, X Tian… - Proceedings of the …, 2019 - openaccess.thecvf.com
Temporally localizing actions in a video is a fundamental challenge in video understanding.
Most existing approaches have often drawn inspiration from image object detection and …

TSP: Temporally-sensitive pretraining of video encoders for localization tasks

H Alwassel, S Giancola… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
Due to the large memory footprint of untrimmed videos, current state-of-the-art video
localization methods operate atop precomputed video clip features. These features are …

Weakly-supervised action localization with background modeling

PX Nguyen, D Ramanan… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com
We describe a latent approach that learns to detect actions in long sequences given training
videos with only whole-video class labels. Our approach makes use of two innovations to …

Structured multi-level interaction network for video moment localization via language query

H Wang, ZJ Zha, L Li, D Liu… - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com
We address the problem of localizing a specific moment described by a natural language
query. Existing works interact the query with either video frame or moment proposal, and …

Scale matters: Temporal scale aggregation network for precise action localization in untrimmed videos

G Gong, L Zheng, Y Mu - 2020 IEEE international conference …, 2020 - ieeexplore.ieee.org
Temporal action localization is a recently-emerging task, aiming to localize video segments
from untrimmed videos which contain specific actions. This work proposes a novel …

UBoCo: Unsupervised boundary contrastive learning for generic event boundary detection

H Kang, J Kim, T Kim, SJ Kim - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Generic Event Boundary Detection (GEBD) is a newly suggested video
understanding task that aims to find one level deeper semantic boundaries of events …

RefineLoc: Iterative refinement for weakly-supervised action localization

A Pardo, H Alwassel, F Caba… - Proceedings of the …, 2021 - openaccess.thecvf.com
Video action detectors are usually trained using datasets with fully-supervised temporal
annotations. Building such datasets is an expensive task. To alleviate this problem, recent …