Rethinking the heatmap regression for bottom-up human pose estimation

Z Luo, Z Wang, Y Huang, L Wang… - Proceedings of the …, 2021 - openaccess.thecvf.com
Heatmap regression has become the most prevalent choice for nowadays human pose
estimation methods. The ground-truth heatmaps are usually constructed by covering all …

Enriching local and global contexts for temporal action localization

Z Zhu, W Tang, L Wang, N Zheng… - Proceedings of the …, 2021 - openaccess.thecvf.com
Effectively tackling the problem of temporal action localization (TAL) necessitates a visual
representation that jointly pursues two confounding goals, i.e., fine-grained discrimination for …

Uncertainty-aware Action Decoupling Transformer for Action Anticipation

H Guo, N Agarwal, SY Lo, K Lee… - Proceedings of the IEEE …, 2024 - openaccess.thecvf.com
Human action anticipation aims at predicting what people will do in the future based on past
observations. In this paper we introduce Uncertainty-aware Action Decoupling Transformer …

Learning grounded vision-language representation for versatile understanding in untrimmed videos

T Wang, J Zhang, F Zheng, W Jiang, R Cheng… - arXiv preprint arXiv …, 2023 - arxiv.org
Joint video-language learning has received increasing attention in recent years. However,
existing works mainly focus on single or multiple trimmed video clips (events), which makes …

ASTRA: An action spotting transformer for soccer videos

A Xarles, S Escalera, TB Moeslund… - Proceedings of the 6th …, 2023 - dl.acm.org
In this paper, we introduce ASTRA, a Transformer-based model designed for the task of
Action Spotting in soccer matches. ASTRA addresses several challenges inherent in the …

ContextLoc++: A unified context model for temporal action localization

Z Zhu, L Wang, W Tang, N Zheng… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Effectively tackling the problem of temporal action localization (TAL) necessitates a visual
representation that jointly pursues two confounding goals, i.e., fine-grained discrimination for …

Multi-dimensional attention with similarity constraint for weakly-supervised temporal action localization

Z Chen, H Liu, L Zhang, X Liao - IEEE Transactions on …, 2022 - ieeexplore.ieee.org
Weakly-supervised temporal action localization (WTAL) is a challenging task in
understanding untrimmed videos, in which no frame-wise annotation is provided during …

TVNet: Temporal voting network for action localization

H Wang, D Damen, M Mirmehdi, T Perrett - arXiv preprint arXiv …, 2022 - arxiv.org
We propose a Temporal Voting Network (TVNet) for action localization in untrimmed videos.
This incorporates a novel Voting Evidence Module to locate temporal boundaries, more …

Class‐wise boundary regression by uncertainty in temporal action detection

Y Chen, M Chen, Q Gu - IET Image Processing, 2022 - Wiley Online Library
Temporal action detection is a crucial aspect of video understanding. It aims to classify the
action as well as locate the start and end boundaries of the action in the untrimmed videos …

Distribution-aware Activity Boundary Representation for Online Detection of Action Start in Untrimmed Videos

X Hu, S Wang, M Li, Y Li, S Du - IEEE Signal Processing Letters, 2024 - ieeexplore.ieee.org
The Online Detection of Action Start (ODAS) has attracted the attention of researchers
because of its practical applications in areas such as security and emergency response …