Google Scholar

QA dataset explosion: A taxonomy of NLP resources for question answering and reading comprehension

A Rogers, M Gardner, I Augenstein - ACM Computing Surveys, 2023 - dl.acm.org

Alongside huge volumes of research on deep learning models in NLP in the recent years,
there has been much work on benchmark datasets needed to track modeling progress …

Cited by 230

[PDF] acm.org

Video generative adversarial networks: a review

N Aldausari, A Sowmya, N Marcus… - ACM Computing Surveys …, 2022 - dl.acm.org

With the increasing interest in the content creation field in multiple sectors such as media,
education, and entertainment, there is an increased trend in the papers that use AI …

Cited by 146

[PDF] neurips.cc

Zero-shot video question answering via frozen bidirectional language models

A Yang, A Miech, J Sivic, I Laptev… - Advances in Neural …, 2022 - proceedings.neurips.cc

Video question answering (VideoQA) is a complex task that requires diverse multi-modal
data for training. Manual annotation of question and answers for videos, however, is tedious …

Cited by 230

[PDF] thecvf.com

Just ask: Learning to answer questions from millions of narrated videos

A Yang, A Miech, J Sivic, I Laptev… - Proceedings of the …, 2021 - openaccess.thecvf.com

Recent methods for visual question answering rely on large-scale annotated datasets.
Manual annotation of questions and answers for videos, however, is tedious, expensive and …

Cited by 319

[PDF] arxiv.org

Video understanding with large language models: A survey

Y Tang, J Bi, S Xu, L Song, S Liang, T Wang… - arXiv preprint arXiv …, 2023 - arxiv.org

With the burgeoning growth of online video platforms and the escalating volume of video
content, the demand for proficient video understanding tools has intensified markedly. Given …

Cited by 60

[PDF] arxiv.org

TVQA: Localized, compositional video question answering

J Lei, L Yu, M Bansal, TL Berg - arXiv preprint arXiv:1809.01696, 2018 - arxiv.org

Recent years have witnessed an increasing interest in image-based question-answering
(QA) tasks. However, due to data limitations, there has been much less work on video-based …

Cited by 719

[PDF] thecvf.com

Hierarchical conditional relation networks for video question answering

TM Le, V Le, S Venkatesh… - Proceedings of the IEEE …, 2020 - openaccess.thecvf.com

Video question answering (VideoQA) is challenging as it requires modeling capacity to distill
dynamic visual artifacts and distant relations and to associate them with linguistic concepts …

Cited by 321

[PDF] aaai.org

[PDF] ActivityNet-QA: A dataset for understanding complex web videos via question answering

Z Yu, D Xu, J Yu, T Yu, Z Zhao, Y Zhuang… - Proceedings of the AAAI …, 2019 - aaai.org

Recent developments in modeling language and vision have been successfully applied to
image question answering. It is both crucial and natural to extend this research direction to …

Cited by 410

[PDF] arxiv.org

VideoDirectorGPT: Consistent multi-scene video generation via LLM-guided planning

H Lin, A Zala, J Cho, M Bansal - arXiv preprint arXiv:2309.15091, 2023 - arxiv.org

Recent text-to-video (T2V) generation methods have seen significant advancements.
However, the majority of these works focus on producing short video clips of a single event …

Cited by 59

[PDF] arxiv.org

TVR: A large-scale dataset for video-subtitle moment retrieval

J Lei, L Yu, TL Berg, M Bansal - … Conference, Glasgow, UK, August 23–28 …, 2020 - Springer

We introduce TV show Retrieval (TVR), a new multimodal retrieval dataset. TVR requires
systems to understand both videos and their associated subtitle (dialogue) texts, making it …

Cited by 301
