Shortcut learning in deep neural networks

R Geirhos, JH Jacobsen, C Michaelis… - Nature Machine …, 2020 - nature.com
Deep learning has triggered the current rise of artificial intelligence and is the workhorse of
today's machine intelligence. Numerous success stories have rapidly spread all over …

A simple llm framework for long-range video question-answering

C Zhang, T Lu, MM Islam, Z Wang, S Yu… - arxiv preprint arxiv …, 2023 - arxiv.org
We present LLoVi, a language-based framework for long-range video question-answering
(LVQA). Unlike prior long-range video understanding methods, which are often costly and …

Condensed movies: Story based retrieval with contextual embeddings

M Bain, A Nagrani, A Brown… - Proceedings of the …, 2020 - openaccess.thecvf.com
Our objective in this work is the long range understanding of the narrative structure of
movies. Instead of considering the entire movie, we propose to learn from the 'key scenes' of …

AMEGO: Active Memory from long EGOcentric videos

G Goletto, T Nagarajan, G Averta, D Damen - European Conference on …, 2024 - Springer
Egocentric videos provide a unique perspective into individuals' daily experiences, yet their
unstructured nature presents challenges for perception. In this paper, we introduce AMEGO …

Learning to cut by watching movies

A Pardo, F Caba, JL Alcázar… - Proceedings of the …, 2021 - openaccess.thecvf.com
Video content creation keeps growing at an incredible pace; yet, creating engaging stories
remains challenging and requires non-trivial video editing expertise. Many video editing …

Grounded multi-hop VideoQA in long-form egocentric videos

Q Chen, S Di, W Xie - arxiv preprint arxiv:2408.14469, 2024 - arxiv.org
This paper considers the problem of Multi-Hop Video Question Answering (MH-VidQA) in
long-form egocentric videos. This task not only requires answering visual questions, but also …

HLVU: A new challenge to test deep understanding of movies the way humans do

K Curtis, G Awad, S Rajput, I Soboroff - Proceedings of the 2020 …, 2020 - dl.acm.org
In this paper we propose a new evaluation challenge and direction in the area of High-level
Video Understanding. The challenge we are proposing is designed to test automatic video …

Long Story Short: a Summarize-then-Search Method for Long Video Question Answering

J Chung, Y Yu - arxiv preprint arxiv:2311.01233, 2023 - arxiv.org
Large language models such as GPT-3 have demonstrated an impressive capability to
adapt to new tasks without requiring task-specific training data. This capability has been …

Situation and behavior understanding by trope detection on films

CH Chang, HT Su, JH Hsu, YS Wang… - Proceedings of the Web …, 2021 - dl.acm.org
The human capacity for deep cognitive skills is crucial for the development of various real-world
applications that process diverse and abundant user-generated input. While recent progress …

SynopGround: A Large-Scale Dataset for Multi-Paragraph Video Grounding from TV Dramas and Synopses

C Tan, Z Lin, J Pu, Z Qi, WY Pei, Z Qu, Y Wang… - Proceedings of the …, 2024 - dl.acm.org
Video grounding is a fundamental problem in multimodal content understanding, aiming to
localize specific natural language queries in an untrimmed video. However, current video …