TriDet: Temporal action detection with relative boundary modeling

D Shi, Y Zhong, Q Cao, L Ma, J Li… - Proceedings of the …, 2023 - openaccess.thecvf.com
In this paper, we present a one-stage framework TriDet for temporal action detection.
Existing methods often suffer from imprecise boundary predictions due to the ambiguous …

ActionFormer: Localizing moments of actions with transformers

CL Zhang, J Wu, Y Li - European Conference on Computer Vision, 2022 - Springer
Self-attention based Transformer models have demonstrated impressive results for image
classification and object detection, and more recently for video understanding. Inspired by …

UnLoc: A unified framework for video localization tasks

S Yan, X Xiong, A Nagrani, A Arnab… - Proceedings of the …, 2023 - openaccess.thecvf.com
While large-scale image-text pretrained models such as CLIP have been used for multiple
video-level tasks on trimmed videos, their use for temporal localization in untrimmed videos …

Vectorized evidential learning for weakly-supervised temporal action localization

J Gao, M Chen, C Xu - IEEE transactions on pattern analysis …, 2023 - ieeexplore.ieee.org
With the explosive growth of videos, weakly-supervised temporal action localization
(WS-TAL) task has become a promising research direction in pattern analysis and machine …

Proposal-based multiple instance learning for weakly-supervised temporal action localization

H Ren, W Yang, T Zhang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Weakly-supervised temporal action localization aims to localize and recognize actions in
untrimmed videos with only video-level category labels during training. Without instance …

Fine-grained temporal contrastive learning for weakly-supervised temporal action localization

J Gao, M Chen, C Xu - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com
We target at the task of weakly-supervised action localization (WSAL), where only
video-level action labels are available during model training. Despite the recent progress, existing …

Dual-evidential learning for weakly-supervised temporal action localization

M Chen, J Gao, S Yang, C Xu - European conference on computer vision, 2022 - Springer
Weakly-supervised temporal action localization (WS-TAL) aims to localize the action
instances and recognize their categories with only video-level labels. Despite great …

ReAct: Temporal action detection with relational queries

D Shi, Y Zhong, Q Cao, J Zhang, L Ma, J Li… - European conference on …, 2022 - Springer
This work aims at advancing temporal action detection (TAD) using an encoder-decoder
framework with action queries, similar to DETR, which has shown great success in object …

Dual DETRs for multi-label temporal action detection

Y Zhu, G Zhang, J Tan, G Wu… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Temporal Action Detection (TAD) aims to identify the action boundaries and the
corresponding category within untrimmed videos. Inspired by the success of DETR in object …

DiffTAD: Temporal action detection with proposal denoising diffusion

S Nag, X Zhu, J Deng, YZ Song… - Proceedings of the …, 2023 - openaccess.thecvf.com
We propose a new formulation of temporal action detection (TAD) with denoising diffusion,
DiffTAD in short. Taking as input random temporal proposals, it can yield action proposals …