Inducing causal structure for interpretable neural networks

A Geiger, Z Wu, H Lu, J Rozner… - International …, 2022 - proceedings.mlr.press
In many areas, we have well-founded insights about causal structure that would be useful to
bring into our trained models while still allowing them to learn in a data-driven fashion. To …

Can large language models be good path planners? A benchmark and investigation on spatial-temporal reasoning

M Aghzal, E Plaku, Z Yao - arXiv preprint arXiv:2310.03249, 2023 - arxiv.org
Large language models (LLMs) have achieved remarkable success across a wide spectrum
of tasks; however, they still face limitations in scenarios that demand long-term planning and …

ReCOGS: How incidental details of a logical form overshadow an evaluation of semantic interpretation

Z Wu, CD Manning, C Potts - Transactions of the Association for …, 2023 - direct.mit.edu
Compositional generalization benchmarks for semantic parsing seek to assess whether
models can accurately compute meanings for novel sentences, but operationalize this in …

A benchmark for compositional visual reasoning

A Zerroug, M Vaishnav, J Colin… - Advances in neural …, 2022 - proceedings.neurips.cc
A fundamental component of human vision is our ability to parse complex visual scenes and
judge the relations between their constituent objects. AI benchmarks for visual reasoning …

LLM-A*: Large language model enhanced incremental heuristic search on path planning

S Meng, Y Wang, CF Yang, N Peng… - arXiv preprint arXiv …, 2024 - arxiv.org
Path planning is a fundamental scientific problem in robotics and autonomous navigation,
requiring the derivation of efficient routes from starting to destination points while avoiding …

Skews in the phenomenon space hinder generalization in text-to-image generation

Y Chang, Y Zhang, Z Fang, YN Wu, Y Bisk… - European Conference on …, 2024 - Springer
The literature on text-to-image generation is plagued by issues of faithfully composing
entities with relations. But there lacks a formal understanding of how entity-relation …

Pushing the limits of rule reasoning in transformers through natural language satisfiability

K Richardson, A Sabharwal - Proceedings of the AAAI Conference on …, 2022 - ojs.aaai.org
Investigating the reasoning abilities of transformer models, and discovering new challenging
tasks for them, has been a topic of much interest. Recent studies have found these models to …

When can transformers ground and compose: Insights from compositional generalization benchmarks

A Sikarwar, A Patel, N Goyal - arXiv preprint arXiv:2210.12786, 2022 - arxiv.org
Humans can reason compositionally whilst grounding language utterances to the real world.
Recent benchmarks like ReaSCAN use navigation tasks grounded in a grid world to assess …

Relational reasoning and generalization using nonsymbolic neural networks

A Geiger, A Carstensen, MC Frank, C Potts - Psychological Review, 2023 - psycnet.apa.org
The notion of equality (identity) is simple and ubiquitous, making it a key case study for
broader questions about the representations supporting abstract relational reasoning …

Imagine the unseen world: A benchmark for systematic generalization in visual world models

Y Kim, G Singh, J Park… - Advances in Neural …, 2023 - proceedings.neurips.cc
Systematic compositionality, or the ability to adapt to novel situations by creating a mental
model of the world using reusable pieces of knowledge, remains a significant challenge in …