Eight things to know about large language models

SR Bowman - arXiv preprint arXiv:2304.00612, 2023 - arxiv.org
The widespread public deployment of large language models (LLMs) in recent months has
prompted a wave of new attention and engagement from advocates, policymakers, and …

Symbols and grounding in large language models

E Pavlick - … Transactions of the Royal Society A, 2023 - royalsocietypublishing.org
Large language models (LLMs) are one of the most impressive achievements of artificial
intelligence in recent years. However, their relevance to the study of language more broadly …

Challenging BIG-Bench tasks and whether chain-of-thought can solve them

M Suzgun, N Scales, N Schärli, S Gehrmann… - arXiv preprint arXiv …, 2022 - arxiv.org
BIG-Bench (Srivastava et al., 2022) is a diverse evaluation suite that focuses on tasks
believed to be beyond the capabilities of current language models. Language models have …

Reasoning or reciting? Exploring the capabilities and limitations of language models through counterfactual tasks

Z Wu, L Qiu, A Ross, E Akyürek, B Chen… - Proceedings of the …, 2024 - aclanthology.org
The impressive performance of recent language models across a wide range of tasks
suggests that they possess a degree of abstract reasoning skills. Are these skills general …

Grounding large language models in interactive environments with online reinforcement learning

T Carta, C Romac, T Wolf, S Lamprier… - International …, 2023 - proceedings.mlr.press
Recent works successfully leveraged Large Language Models' (LLM) abilities to capture
abstract knowledge about the world's physics to solve decision-making problems. Yet, the …

Language models represent space and time

W Gurnee, M Tegmark - arXiv preprint arXiv:2310.02207, 2023 - arxiv.org
The capabilities of large language models (LLMs) have sparked debate over whether such
systems just learn an enormous collection of superficial statistics or a set of more coherent …

Emergent world representations: Exploring a sequence model trained on a synthetic task

K Li, AK Hopkins, D Bau, F Viégas, H Pfister… - ICLR, 2023 - par.nsf.gov
Language models show a surprising range of capabilities, but the source of their apparent
competence is unclear. Do these networks just memorize a collection of surface statistics, or …

The platonic representation hypothesis

M Huh, B Cheung, T Wang, P Isola - arXiv preprint arXiv:2405.07987, 2024 - r.jordan.im
We argue that representations in AI models, particularly deep networks, are converging.
First, we survey many examples of convergence in the literature: over time and across …

Is a picture worth a thousand words? Delving into spatial reasoning for vision language models

J Wang, Y Ming, Z Shi, V Vineet… - Advances in Neural …, 2025 - proceedings.neurips.cc
Large language models (LLMs) and vision-language models (VLMs) have demonstrated
remarkable performance across a wide range of tasks and domains. Despite this promise …

From task structures to world models: what do LLMs know?

I Yildirim, LA Paul - Trends in Cognitive Sciences, 2024 - cell.com
In what sense does a large language model (LLM) have knowledge? We answer by
granting LLMs 'instrumental knowledge': knowledge gained by using next-word generation …