- Academic Search

P Mehrani, JK Tsotsos - Frontiers in Computer Science, 2023 - frontiersin.org

Recently, a considerable number of studies in computer vision involve deep neural
architectures called vision transformers. Visual processing in these models incorporates …

Cited by 26

Compositionality in perception: A framework

KJ Lande - Wiley Interdisciplinary Reviews: Cognitive Science, 2024 - Wiley Online Library

Perception involves the processing of content or information about the world. In what form is
this content represented? I argue that perception is widely compositional. The perceptual …

Cited by 3

Imagine the unseen world: a benchmark for systematic generalization in visual world models

Y Kim, G Singh, J Park… - Advances in Neural …, 2024 - proceedings.neurips.cc

Systematic compositionality, or the ability to adapt to novel situations by creating a mental
model of the world using reusable pieces of knowledge, remains a significant challenge in …

Cited by 1

Does continual learning meet compositionality? New benchmarks and an evaluation framework

W Liao, Y Wei, M Jiang, Q Zhang… - Advances in Neural …, 2024 - proceedings.neurips.cc

Compositionality facilitates the comprehension of novel objects using acquired concepts
and the maintenance of a knowledge pool. This is particularly crucial for continual learners …

Cited by 2

Emergent communication for rules reasoning

Y Guo, Y Hao, R Zhang, E Zhou, Z Du… - Advances in …, 2024 - proceedings.neurips.cc

Research on emergent communication between deep-learning-based agents has received
extensive attention due to its inspiration for linguistics and artificial intelligence. However …

R-VQA: A robust visual question answering model

S Chowdhury, B Soni - Knowledge-Based Systems, 2025 - Elsevier

Visual Question Answering (VQA) involves generating answers to questions about
visual content, such as images. VQA models process an image and a question to produce …

Benchmarking Robustness of Text-Image Composed Retrieval

S Sun, J Gu, S Gong - arXiv preprint arXiv:2311.14837, 2023 - suntongtongtong.github.io

Text-image composed retrieval aims to retrieve the target image through the composed
query, which is specified in the form of an image plus some text that describes desired …

Cited by 1
