Temporal sentence grounding in videos: A survey and future directions
Temporal sentence grounding in videos (TSGV), a.k.a. natural language video localization
(NLVL) or video moment retrieval (VMR), aims to retrieve a temporal moment that …
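In code, the task the survey defines reduces to a simple signature: take a video and a sentence query, return one (start, end) span. Below is a minimal sketch assuming pre-extracted clip and query features; the function name, the mean-pooling, and the threshold `tau` are illustrative assumptions, and the maximum-subarray scoring is a toy baseline rather than any method from the survey.

```python
from typing import Tuple

import torch


def ground_sentence_in_video(
    video_features: torch.Tensor,   # (num_clips, d) pre-extracted clip features
    query_features: torch.Tensor,   # (num_tokens, d) encoded sentence query
    tau: float = 0.3,               # assumed relevance threshold
) -> Tuple[int, int]:
    """Return (start_clip, end_clip) of the best-scoring contiguous window."""
    query = query_features.mean(dim=0)  # pool query tokens into one vector
    sims = torch.cosine_similarity(video_features, query.unsqueeze(0), dim=1)
    gains = (sims - tau).tolist()       # each clip contributes (sim - tau)
    best, best_span = float("-inf"), (0, 0)
    run_start, run_sum = 0, 0.0
    for i, g in enumerate(gains):       # Kadane's maximum-subarray scan
        if run_sum <= 0.0:
            run_start, run_sum = i, g   # restart the window at clip i
        else:
            run_sum += g
        if run_sum > best:
            best, best_span = run_sum, (run_start, i)
    return best_span
```

A window scores the sum of (similarity - tau) over its clips, so a run of several strongly matching clips can outscore a single best clip; real TSGV models replace this heuristic with learned cross-modal alignment.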
Knowing where to focus: Event-aware transformer for video grounding
Recent DETR-based video grounding models directly predict moment timestamps
without any hand-crafted components, such as a pre-defined proposal or non …
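The direct-prediction recipe this abstract contrasts with proposal-based pipelines can be sketched as a DETR-style decoder head: learnable moment queries attend to fused video-text features and are regressed straight to normalized spans, ranked by a score head instead of being filtered with NMS. The class name, dimensions, and layer counts below are illustrative assumptions, not this paper's event-aware architecture.

```python
import torch
import torch.nn as nn


class MomentDETRHead(nn.Module):
    """Decodes learnable moment queries into span predictions, DETR-style."""

    def __init__(self, d_model: int = 256, num_queries: int = 10):
        super().__init__()
        self.moment_queries = nn.Parameter(torch.randn(num_queries, d_model))
        layer = nn.TransformerDecoderLayer(d_model, nhead=8, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, num_layers=2)
        # Regress each decoded query straight to a normalized (center, width).
        self.span_head = nn.Sequential(
            nn.Linear(d_model, d_model), nn.ReLU(), nn.Linear(d_model, 2)
        )
        self.score_head = nn.Linear(d_model, 1)  # per-query foreground logit

    def forward(self, memory: torch.Tensor):
        # memory: (batch, num_clips, d_model) fused video-text features
        queries = self.moment_queries.expand(memory.size(0), -1, -1)
        decoded = self.decoder(queries, memory)
        spans = self.span_head(decoded).sigmoid()  # normalized (center, width)
        logits = self.score_head(decoded)          # rank spans, no NMS needed
        return spans, logits
```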
You can ground earlier than see: An effective and efficient pipeline for temporal sentence grounding in compressed videos
Given an untrimmed video, temporal sentence grounding (TSG) aims to locate the
target moment that semantically corresponds to a sentence query. Although previous respectable works …
Towards generalisable video moment retrieval: Visual-dynamic injection to image-text pre-training
The correlation between vision and text is essential for video moment retrieval (VMR);
however, existing methods rely heavily on separate pre-training feature extractors for visual …
Rethinking weakly-supervised video temporal grounding from a game perspective
This paper addresses the challenging task of weakly-supervised video temporal grounding.
Existing approaches are generally based on the moment proposal selection framework that …
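For context, the moment-proposal-selection framework this abstract refers to typically enumerates multi-scale sliding-window candidates, scores each against the query, and keeps the top one; in the weakly-supervised setting that top proposal then stands in for the missing temporal annotation. A minimal sketch under that reading follows; the window scales, stride, and cosine scoring are all assumptions for illustration.

```python
import torch


def select_moment_proposal(
    clip_feats: torch.Tensor,     # (num_clips, d) video clip features
    query_feat: torch.Tensor,     # (d,) pooled sentence feature
    window_sizes=(4, 8, 16),      # assumed multi-scale window lengths
    stride: int = 2,
):
    """Score sliding-window proposals against the query and keep the best."""
    best_score, best_span = float("-inf"), (0, 0)
    for w in window_sizes:
        for start in range(0, max(1, len(clip_feats) - w + 1), stride):
            proposal = clip_feats[start : start + w].mean(dim=0)  # pool window
            score = torch.cosine_similarity(proposal, query_feat, dim=0).item()
            if score > best_score:
                best_score, best_span = score, (start, start + w)
    return best_span, best_score
```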
Fewer steps, better performance: Efficient cross-modal clip trimming for video moment retrieval using language
Given an untrimmed video and a sentence query, video moment retrieval using language
(VMR) aims to locate a target query-relevant moment. Since the untrimmed video is …
HawkEye: Training video-text LLMs for grounding text in videos
Video-text Large Language Models (video-text LLMs) have shown remarkable performance
in answering questions and holding conversations on simple videos. However, they perform …
AutoEval-Video: An automatic benchmark for assessing large vision-language models in open-ended video question answering
We propose a novel and challenging benchmark, AutoEval-Video, to comprehensively
evaluate large vision-language models in open-ended video question answering. The …
Dual learning with dynamic knowledge distillation for partially relevant video retrieval
J Dong, M Zhang, Z Zhang, X Chen… - Proceedings of the …, 2023 - openaccess.thecvf.com
Almost all previous text-to-video retrieval works assume that videos are pre-trimmed with
short durations. However, in practice, videos are generally untrimmed, containing much …
Partial annotation-based video moment retrieval via iterative learning
Given a descriptive language query, Video Moment Retrieval (VMR) aims to locate the
corresponding semantically consistent moment clip in the video, which is represented as a pair …