The best game in town: The reemergence of the language-of-thought hypothesis across the cognitive sciences

J Quilty-Dunn, N Porot, E Mandelbaum - Behavioral and Brain …, 2023 - cambridge.org
Mental representations remain the central posits of psychology after many decades of
scrutiny. However, there is no consensus about the representational format(s) of biological …

DALL-E-Bot: Introducing web-scale diffusion models to robotics

I Kapelyukh, V Vosylius, E Johns - IEEE Robotics and …, 2023 - ieeexplore.ieee.org
We introduce the first work to explore web-scale diffusion models for robotics. DALL-E-Bot
enables a robot to rearrange objects in a scene, by first inferring a text description of those …

Consciousness in artificial intelligence: insights from the science of consciousness

P Butlin, R Long, E Elmoznino, Y Bengio… - arXiv preprint arXiv …, 2023 - arxiv.org
Whether current or near-term AI systems could be conscious is a topic of scientific interest
and increasing public concern. This report argues for, and exemplifies, a rigorous and …

DOCCI: Descriptions of connected and contrasting images

Y Onoe, S Rane, Z Berger, Y Bitton, J Cho… - … on Computer Vision, 2024 - Springer
Vision-language datasets are vital for both text-to-image (T2I) and image-to-text (I2T)
research. However, current datasets lack descriptions with fine-grained detail that would …

From word models to world models: Translating from natural language to the probabilistic language of thought

L Wong, G Grand, AK Lew, ND Goodman… - arXiv preprint arXiv …, 2023 - arxiv.org
How does language inform our downstream thinking? In particular, how do humans make
meaning from language--and how can we leverage a theory of linguistic meaning to build …

No "zero-shot" without exponential data: Pretraining concept frequency determines multimodal model performance

V Udandarao, A Prabhu, A Ghosh… - The Thirty-eighth …, 2024 - openreview.net
Web-crawled pretraining datasets underlie the impressive "zero-shot" evaluation
performance of multimodal models, such as CLIP for classification and Stable-Diffusion for …