Efficient test-time model adaptation without forgetting
Test-time adaptation provides an effective means of tackling the potential distribution shift
between model training and inference, by dynamically updating the model at test time. This …
Learning to refactor action and co-occurrence features for temporal action localization
The main challenge of Temporal Action Localization is to retrieve subtle human actions from
various co-occurring ingredients, e.g., context and background, in an untrimmed video. While …
Colar: Effective and efficient online action detection by consulting exemplars
L Yang, J Han, D Zhang - … of the IEEE/CVF conference on …, 2022 - openaccess.thecvf.com
Online action detection has attracted increasing research interests in recent years. Current
works model historical dependencies and anticipate the future to perceive the action …
Video action segmentation via contextually refined temporal keypoints
Video action segmentation refers to the task of densely casting each video frame or short
segment in an untrimmed video into some pre-specified action categories. Although recent …
Semantic and relation modulation for audio-visual event localization
We study the problem of localizing audio-visual events that are both audible and visible in a
video. Existing works focus on encoding and aligning audio and visual features at the …
Learning from noisy pseudo labels for semi-supervised temporal action localization
Semi-Supervised Temporal Action Localization (SS-TAL) aims to improve the
generalization ability of action detectors with large-scale unlabeled videos. Albeit the recent …
Temporal action localization in the deep learning era: A survey
The temporal action localization research aims to discover action instances from untrimmed
videos, representing a fundamental step in the field of intelligent video understanding. With …
Compact representation and reliable classification learning for point-level weakly-supervised action localization
Point-level weakly-supervised temporal action localization (P-WSTAL) aims to localize
temporal extents of action instances and identify the corresponding categories with only a …
Videocot: A video chain-of-thought dataset with active annotation tool
Y Wang, Y Zeng, J Zheng, X Xing, J Xu, X Xu - arXiv preprint arXiv …, 2024 - arxiv.org
Multimodal large language models (MLLMs) are flourishing, but mainly focus on images with
less attention than videos, especially in sub-fields such as prompt engineering, video chain …
Uncertainty guided collaborative training for weakly supervised and unsupervised temporal action localization
In weakly supervised (WSAL) and unsupervised temporal action localization (UAL), the
target is to simultaneously localize temporal boundaries and identify category labels of …