Google Scholar

K Sanders, B Van Durme - … of the IEEE/CVF Conference on …, 2024 - openaccess.thecvf.com

While existing video benchmarks largely consider specialized downstream tasks like
retrieval or question-answering (QA), contemporary multimodal AI systems must be capable …

Cited by 3



Employing Glyphic Information for Chinese Event Extraction with Vision-Language Model

X Bao, J Gu, Z Wang, M Qiang… - Findings of the …, 2024 - aclanthology.org

As a complex task that requires rich information input, features from various aspects have
been utilized in event extraction. However, most of the previous works ignored the value of …




[PDF] MUMOSA, Interactive Dashboard for MUlti-MOdal Situation Awareness

S Lukin, S Bowser, R Suchocki… - Proceedings of the …, 2024 - aclanthology.org

Information extraction has led the way for event detection from text for many years.
Recent advances in neural models, such as Large Language Models (LLMs) and Vision …



Resin-editor: A schema-guided hierarchical event graph visualizer and editor

A survey of video datasets for grounded event understanding
