Google Scholar

Deep learning-based action detection in untrimmed videos: A survey

E Vahdani, Y Tian - IEEE Transactions on Pattern Analysis and …, 2022 - ieeexplore.ieee.org

Understanding human behavior and activity facilitates advancement of numerous real-world
applications, and is critical for video analysis. Despite the progress of action recognition …

Cited by 75, 8 versions

[PDF] arxiv.org

Actionformer: Localizing moments of actions with transformers

CL Zhang, J Wu, Y Li - European Conference on Computer Vision, 2022 - Springer

Self-attention based Transformer models have demonstrated impressive results for image
classification and object detection, and more recently for video understanding. Inspired by …

Cited by 422, 7 versions

[PDF] thecvf.com

Univtg: Towards unified video-language temporal grounding

KQ Lin, P Zhang, J Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com

Video Temporal Grounding (VTG), which aims to ground target clips from videos
(such as consecutive intervals or disjoint shots) according to custom language queries (eg …

Cited by 121, 4 versions

[PDF] arxiv.org

Temporal sentence grounding in videos: A survey and future directions

H Zhang, A Sun, W Jing, JT Zhou - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Temporal sentence grounding in videos (TSGV), aka, natural language video localization
(NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that …

Cited by 54, 8 versions

[PDF] thecvf.com

Video self-stitching graph network for temporal action localization

C Zhao, AK Thabet, B Ghanem - Proceedings of the IEEE …, 2021 - openaccess.thecvf.com

Temporal action localization (TAL) in videos is a challenging task, especially due to the
large variation in action temporal scales. Short actions usually occupy a major proportion in …

Cited by 173, 8 versions

[PDF] thecvf.com

An empirical study of end-to-end temporal action detection

X Liu, S Bai, X Bai - … of the IEEE/CVF Conference on …, 2022 - openaccess.thecvf.com

Temporal action detection (TAD) is an important yet challenging task in video
understanding. It aims to simultaneously predict the semantic label and the temporal interval …

Cited by 73, 5 versions

[PDF] arxiv.org

Locvtp: Video-text pre-training for temporal localization

M Cao, T Yang, J Weng, C Zhang, J Wang… - European Conference on …, 2022 - Springer

Video-Text Pre-training (VTP) aims to learn transferable representations for various
downstream tasks from large-scale web videos. To date, almost all existing VTP methods …

Cited by 71, 6 versions

[PDF] arxiv.org

Zero-shot temporal action detection via vision-language prompting

S Nag, X Zhu, YZ Song, T Xiang - European Conference on Computer …, 2022 - Springer

Existing temporal action detection (TAD) methods rely on large training data including
segment-level annotations, limited to recognizing previously seen classes alone during …

Cited by 69, 8 versions

[PDF] arxiv.org

Cross-modal consensus network for weakly supervised temporal action localization

FT Hong, JC Feng, D Xu, Y Shan… - Proceedings of the 29th …, 2021 - dl.acm.org

Weakly supervised temporal action localization (WS-TAL) is a challenging task that aims to
localize action instances in the given video with video-level categorical supervision …

Cited by 93, 4 versions

[PDF] arxiv.org

Proposal-free temporal action detection via global segmentation mask learning

S Nag, X Zhu, YZ Song, T Xiang - European Conference on Computer …, 2022 - Springer

Existing temporal action detection (TAD) methods rely on generating an overwhelmingly
large number of proposals per video. This leads to complex model designs due to proposal …

Cited by 50, 6 versions

Boundary-sensitive pre-training for temporal localization in videos

Deep learning-based action detection in untrimmed videos: A survey
