- Academic Search

From recognition to cognition: Visual commonsense reasoning

R Zellers, Y Bisk, A Farhadi… - Proceedings of the IEEE …, 2019 - openaccess.thecvf.com

Visual understanding goes well beyond object recognition. With one glance at an image, we
can effortlessly imagine the world beyond the pixels: for instance, we can infer people's …

Cited by 1020 - All 7 versions



Swag: A large-scale adversarial dataset for grounded commonsense inference

R Zellers, Y Bisk, R Schwartz, Y Choi - ar…

… down U-net for segmentation of biomedical images on platforms with low computational budgets

PK Gadosey, Y Li, EA Agyekum, T Zhang, Z Liu… - Diagnostics, 2020 - mdpi.com

During image segmentation tasks in computer vision, achieving high accuracy performance
while requiring fewer computations and faster inference is a big challenge. This is especially …

Cited by 100 - All 10 versions



Procedure planning in instructional videos

CY Chang, DA Huang, D Xu, E Adeli, L Fei-Fei… - … on Computer Vision, 2020 - Springer

In this paper, we study the problem of procedure planning in instructional videos, which can
be seen as a step towards enabling autonomous agents to plan for complex tasks in …

Cited by 105 - All 5 versions



Event-guided procedure planning from instructional videos with text supervision

AL Wang, KY Lin, JR Du, J Meng… - Proceedings of the …, 2023 - openaccess.thecvf.com

In this work, we focus on the task of procedure planning from instructional videos with text
supervision, where a model aims to predict an action sequence to transform the initial visual …

Cited by 14 - All 5 versions



Egodistill: Egocentric head motion distillation for efficient video understanding

S Tan, T Nagarajan, K Grauman - Advances in Neural …, 2023 - proceedings.neurips.cc

Recent advances in egocentric video understanding models are promising, but their heavy
computational expense is a barrier for many real-world applications. To address this …

Cited by 20 - All 6 versions


Saved to My library

From recognition to cognition: Visual commonsense reasoning

Swag: A large-scale adversarial dataset for grounded commonsense inference

Procedure planning in instructional videos

Event-guided procedure planning from instructional videos with text supervision

Egodistill: Egocentric head motion distillation for efficient video understanding