UniVTG: Towards unified video-language temporal grounding

KQ Lin, P Zhang, J Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Video Temporal Grounding (VTG), which aims to ground target clips from videos
(such as consecutive intervals or disjoint shots) according to custom language queries (e.g. …

Query-dependent video representation for moment retrieval and highlight detection

WJ Moon, S Hyun, SU Park, D Park… - Proceedings of the …, 2023 - openaccess.thecvf.com
Recently, video moment retrieval and highlight detection (MR/HD) are being spotlighted as
the demand for video understanding is drastically increased. The key objective of MR/HD is …

Bridging the gap: A unified video comprehension framework for moment retrieval and highlight detection

Y Xiao, Z Luo, Y Liu, Y Ma, H Bian… - Proceedings of the …, 2024 - openaccess.thecvf.com
Video Moment Retrieval (MR) and Highlight Detection (HD) have attracted
significant attention due to the growing demand for video analysis. Recent approaches treat …

UMT: Unified multi-modal transformers for joint video moment retrieval and highlight detection

Y Liu, S Li, Y Wu, CW Chen… - Proceedings of the …, 2022 - openaccess.thecvf.com
Finding relevant moments and highlights in videos according to natural language queries is
a natural and highly valuable common need in the current video content explosion era …

Joint visual and audio learning for video highlight detection

T Badamdorj, M Rochan, Y Wang… - Proceedings of the …, 2021 - openaccess.thecvf.com
In video highlight detection, the goal is to identify the interesting moments within an unedited
video. Although the audio component of the video provides important cues for highlight …

R²-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding

Y Liu, J He, W Li, J Kim, D Wei, H Pfister… - European Conference on …, 2024 - Springer
Video temporal grounding (VTG) is a fine-grained video understanding problem that aims to
ground relevant clips in untrimmed videos given natural language queries. Most existing …

Correlation-guided query-dependency calibration in video representation learning for temporal grounding

WJ Moon, S Hyun, SB Lee, JP Heo - CoRR, 2023 - openreview.net
Temporal Grounding is to identify specific moments or highlights from a video corresponding
to textual descriptions. Typical approaches in temporal grounding treat all video clips …

MH-DETR: Video moment and highlight detection with cross-modal transformer

Y Xu, Y Sun, B Zhai, Y Jia, S Du - 2024 International Joint …, 2024 - ieeexplore.ieee.org
With the increasing demand for video understanding, video moment and highlight detection
(MHD) has emerged as a critical research topic. MHD aims to localize all moments and …

Contrastive learning for unsupervised video highlight detection

T Badamdorj, M Rochan, Y Wang… - Proceedings of the …, 2022 - openaccess.thecvf.com
Video highlight detection can greatly simplify video browsing, potentially paving the way for
a wide range of applications. Existing efforts are mostly fully-supervised, requiring humans …

TR-DETR: Task-reciprocal transformer for joint moment retrieval and highlight detection

H Sun, M Zhou, W Chen, W Xie - … of the AAAI Conference on Artificial …, 2024 - ojs.aaai.org
Video moment retrieval (MR) and highlight detection (HD) based on natural language
queries are two highly related tasks, which aim to obtain relevant moments within videos …