Temporal sentence grounding in videos: A survey and future directions
Temporal sentence grounding in videos (TSGV), aka, natural language video localization
(NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that …
A survey on video moment localization
Video moment localization, also known as video moment retrieval, aims to search a target
segment within a video described by a given natural language query. Beyond the task of …
Deconfounded video moment retrieval with causal intervention
We tackle the task of video moment retrieval (VMR), which aims to localize a specific
moment in a video according to a textual query. Existing methods primarily model the …
Weakly supervised temporal sentence grounding with gaussian-based contrastive proposal learning
Temporal sentence grounding aims to detect the most salient moment corresponding to the
natural language query from untrimmed videos. As labeling the temporal boundaries is labor …
G2L: Semantically aligned and uniform video grounding via geodesic and game theory
The recent video grounding works attempt to introduce vanilla contrastive learning into video
grounding. However, we claim that this naive solution is suboptimal. Contrastive learning …
Fast video moment retrieval
This paper targets at fast video moment retrieval (fast VMR), aiming to localize the target
moment efficiently and accurately as queried by a given natural language sentence. We …
WINNER: Weakly-supervised hierarchical decomposition and alignment for spatio-temporal video grounding
Spatio-temporal video grounding aims to localize the aligned visual tube corresponding to a
language query. Existing techniques achieve such alignment by exploiting dense boundary …
Weakly supervised video moment localization with contrastive negative sample mining
Video moment localization aims at localizing the video segments which are most related to
the given free-form natural language query. The weakly supervised setting, where only …
Structured multi-level interaction network for video moment localization via language query
We address the problem of localizing a specific moment described by a natural language
query. Existing works interact the query with either video frame or moment proposal, and …
Multi-stage aggregated transformer network for temporal language localization in videos
We address the problem of localizing a specific moment from an untrimmed video by a
language sentence query. Generally, previous methods mainly exist two problems that are …