Google Scholar

SK Sampat, M Patel, S Das, Y Yang, C Baral - arXiv preprint arXiv …, 2022 - arxiv.org

'Actions' play a vital role in how humans interact with the world and enable them to achieve
desired goals. As a result, most common sense (CS) knowledge for humans revolves …

Cited by 11

Video2commonsense: Generating commonsense descriptions to enrich video captioning

Z Fang, T Gokhale, P Banerjee, C Baral… - arXiv preprint arXiv …, 2020 - arxiv.org

Captioning is a crucial and challenging task for video understanding. In videos that involve
active agents such as humans, the agent's actions can bring about myriad changes in the …

Cited by 73

Cripp-vqa: Counterfactual reasoning about implicit physical properties via video question answering

M Patel, T Gokhale, C Baral, Y Yang - arXiv preprint arXiv:2211.03779, 2022 - arxiv.org

Videos often capture objects, their visible properties, their motion, and the interactions
between different objects. Objects also have physical properties such as mass, which the …

Cited by 14

Neural constraint satisfaction: Hierarchical abstraction for combinatorial generalization in object rearrangement

M Chang, AL Dayan, F Meier, TL Griffiths… - arXiv preprint arXiv …, 2023 - arxiv.org

Object rearrangement is a challenge for embodied agents because solving these tasks
requires generalizing across a combinatorially large set of configurations of entities and their …

Cited by 4

Hierarchical abstraction for combinatorial generalization in object rearrangement

M Chang, AL Dayan, F Meier, TL Griffiths… - … 2022 Workshop on …, 2022 - openreview.net

Object rearrangement is a challenge for embodied agents because solving these tasks
requires generalizing across a combinatorially large set of underlying entities that take the …

Cited by 6

ActionCOMET: A Zero-shot Approach to Learn Image-specific Commonsense Concepts about Actions

SK Sampat, Y Yang, C Baral - arXiv preprint arXiv:2410.13662, 2024 - arxiv.org

Humans observe various actions being performed by other humans (physically or in
videos/images) and can draw a wide range of inferences about it beyond what they can …

Saved to library

Cooking with blocks: A recipe for visual reasoning on image-pairs

Reasoning about actions over visual and linguistic modalities: A survey

Video2commonsense: Generating commonsense descriptions to enrich video captioning

Cripp-vqa: Counterfactual reasoning about implicit physical properties via video question answering

Neural constraint satisfaction: Hierarchical abstraction for combinatorial generalization in object rearrangement

Hierarchical abstraction for combinatorial generalization in object rearrangement

ActionCOMET: A Zero-shot Approach to Learn Image-specific Commonsense Concepts about Actions