- Academic Search

Winner: Weakly-supervised hierarchical decomposition and alignment for spatio-temporal video grounding

M Li, H Wang, W Zhang, J Miao… - Proceedings of the …, 2023 - openaccess.thecvf.com

Spatio-temporal video grounding aims to localize the aligned visual tube corresponding to a
language query. Existing techniques achieve such alignment by exploiting dense boundary …

Cited by 36

[PDF] thecvf.com

Are binary annotations sufficient? video moment retrieval via hierarchical uncertainty-based active learning

W Ji, R Liang, Z Zheng, W Zhang… - Proceedings of the …, 2023 - openaccess.thecvf.com

Recent research on video moment retrieval has mostly focused on enhancing the
performance of accuracy, efficiency, and robustness, all of which largely rely on the …

Cited by 36

[PDF] thecvf.com

Discovering spatio-temporal rationales for video question answering

Y Li, J ** visual instances to subjects/objects, and their relationships to …

Cited by 11

[PDF] arxiv.org

Mrtnet: Multi-resolution temporal network for video sentence grounding

W Ji, Y Qin, L Chen, Y Wei, Y Wu… - ICASSP 2024-2024 …, 2024 - ieeexplore.ieee.org

Video sentence grounding locates a specific moment in a video based on a text query.
Existing methods focus on single temporal resolution, ignoring multi-scale temporal …

Cited by 18

Lite-MKD: A Multi-modal Knowledge Distillation Framework for Lightweight Few-shot Action Recognition

B Liu, T Zheng, P Zheng, D Liu, X Qu, J Gao… - Proceedings of the 31st …, 2023 - dl.acm.org

Existing few-shot action recognition methods have placed primary focus on improving the
recognition accuracy while neglecting another important indicator in practical scenarios, i.e. …

Cited by 6

[PDF] google.com

Filling the Information Gap between Video and Query for Language-Driven Moment Retrieval

D Liu, X Qu, J Dong, G Nan, P Zhou, Z Xu… - Proceedings of the 31st …, 2023 - dl.acm.org

This paper addresses the challenging task of language-driven moment retrieval. Previous
methods are typically trained to localize the target moment corresponding to a single …

Cited by 8

[PDF] researchgate.net

Deep multimodal learning for information retrieval

W Ji, Y Wei, Z Zheng, H Fei, T Chua - Proceedings of the 31st ACM …, 2023 - dl.acm.org

Information retrieval (IR) is a fundamental technique that aims to acquire information from a
collection of documents, web pages, or other sources. While traditional text-based IR has …

Cited by 6

Vidvrd 2021: The third grand challenge on video relation detection