A survey on video moment localization

M Liu, L Nie, Y Wang, M Wang, Y Rui - ACM Computing Surveys, 2023 - dl.acm.org
Video moment localization, also known as video moment retrieval, aims to search a target
segment within a video described by a given natural language query. Beyond the task of …

Temporal sentence grounding in videos: A survey and future directions

H Zhang, A Sun, W Jing, JT Zhou - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Temporal sentence grounding in videos (TSGV), aka, natural language video localization
(NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that …

Weakly supervised temporal sentence grounding with Gaussian-based contrastive proposal learning

M Zheng, Y Huang, Q Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com
Temporal sentence grounding aims to detect the most salient moment corresponding to the
natural language query from untrimmed videos. As labeling the temporal boundaries is labor …

You can ground earlier than see: An effective and efficient pipeline for temporal sentence grounding in compressed videos

X Fang, D Liu, P Zhou, G Nan - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Given an untrimmed video, temporal sentence grounding (TSG) aims to locate a target
moment semantically according to a sentence query. Although previous respectable works …

Proposal-based multiple instance learning for weakly-supervised temporal action localization

H Ren, W Yang, T Zhang… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com
Weakly-supervised temporal action localization aims to localize and recognize actions in
untrimmed videos with only video-level category labels during training. Without instance …

Uncertainty guided collaborative training for weakly supervised temporal action detection

W Yang, T Zhang, X Yu, T Qi… - Proceedings of the …, 2021 - openaccess.thecvf.com
Weakly supervised temporal action detection aims to localize temporal boundaries of
actions and identify their categories simultaneously with only video-level category labels …

Action unit memory network for weakly supervised temporal action localization

W Luo, T Zhang, W Yang, J Liu, T Mei… - Proceedings of the …, 2021 - openaccess.thecvf.com
Weakly supervised temporal action localization aims to detect and localize actions in
untrimmed videos with only video-level labels during training. However, without frame-level …

Rethinking weakly-supervised video temporal grounding from a game perspective

X Fang, Z Xiong, W Fang, X Qu, C Chen, J Dong… - … on Computer Vision, 2024 - Springer
This paper addresses the challenging task of weakly-supervised video temporal grounding.
Existing approaches are generally based on the moment proposal selection framework that …

Weakly supervised video moment localization with contrastive negative sample mining

M Zheng, Y Huang, Q Chen, Y Liu - … of the AAAI Conference on Artificial …, 2022 - ojs.aaai.org
Video moment localization aims at localizing the video segments which are most related to
the given free-form natural language query. The weakly supervised setting, where only …

Weakly supervised temporal sentence grounding with uncertainty-guided self-training

Y Huang, L Yang, Y Sato - … of the IEEE/CVF conference on …, 2023 - openaccess.thecvf.com
The task of weakly supervised temporal sentence grounding aims at finding the
corresponding temporal moments of a language description in the video, given video …