- Academic Search

A survey on video moment localization

M Liu, L Nie, Y Wang, M Wang, Y Rui - ACM Computing Surveys, 2023 - dl.acm.org

Video moment localization, also known as video moment retrieval, aims to search a target
segment within a video described by a given natural language query. Beyond the task of …

Cited by 32

Momentdiff: Generative video moment retrieval from random to real

P Li, CW Xie, H Xie, L Zhao, L Zhang… - Advances in neural …, 2024 - proceedings.neurips.cc

Video moment retrieval pursues an efficient and generalized solution to identify the specific
temporal segments within an untrimmed video that correspond to a given language …

Cited by 64

Temporal sentence grounding in videos: A survey and future directions

H Zhang, A Sun, W Jing, JT Zhou - IEEE Transactions on …, 2023 - ieeexplore.ieee.org

Temporal sentence grounding in videos (TSGV), aka, natural language video localization
(NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that …

Cited by 53

Detecting moments and highlights in videos via natural language queries

J Lei, TL Berg, M Bansal - Advances in Neural Information …, 2021 - proceedings.neurips.cc

Detecting customized moments and highlights from videos given natural language (NL) user
queries is an important but under-studied topic. One of the challenges in pursuing this …

Cited by 254

Cross-modal causal relational reasoning for event-level visual question answering

Y Liu, G Li, L Lin - IEEE Transactions on Pattern Analysis and …, 2023 - ieeexplore.ieee.org

Existing visual question answering methods often suffer from cross-modal spurious
correlations and oversimplified event-level reasoning processes that fail to capture event …

Cited by 119

Unloc: A unified framework for video localization tasks

S Yan, X Xiong, A Nagrani, A Arnab… - Proceedings of the …, 2023 - openaccess.thecvf.com

While large-scale image-text pretrained models such as CLIP have been used for multiple
video-level tasks on trimmed videos, their use for temporal localization in untrimmed videos …

Cited by 50

Vidchapters-7m: Video chapters at scale

A Yang, A Nagrani, I Laptev, J Sivic… - Advances in Neural …, 2024 - proceedings.neurips.cc

Segmenting untrimmed videos into chapters enables users to quickly navigate to the
information of their interest. This important topic has been understudied due to the lack of …

Cited by 28

Deconfounded video moment retrieval with causal intervention

X Yang, F Feng, W Ji, M Wang, TS Chua - Proceedings of the 44th …, 2021 - dl.acm.org

We tackle the task of video moment retrieval (VMR), which aims to localize a specific
moment in a video according to a textual query. Existing methods primarily model the …

Cited by 195

Negative sample matters: A renaissance of metric learning for temporal grounding

Z Wang, L Wang, T Wu, T Li, G Wu - … of the AAAI Conference on Artificial …, 2022 - ojs.aaai.org

Temporal grounding aims to localize a video moment which is semantically aligned with a
given natural language query. Existing methods typically apply a detection or regression …

Cited by 133

Knowing where to focus: Event-aware transformer for video grounding

J Jang, J Park, J Kim, H Kwon… - Proceedings of the IEEE …, 2023 - openaccess.thecvf.com

Recent DETR-based video grounding models have made the model directly predict moment
timestamps without any hand-crafted components, such as a pre-defined proposal or non …

Cited by 51

Interventional video grounding with dual contrastive learning

A survey on video moment localization
