Modern language models refute Chomsky's approach to language

ST Piantadosi - From fieldwork to linguistic theory: A tribute to …, 2023 - books.google.com
Modern machine learning has subverted and bypassed the theoretical framework of
Chomsky's generative approach to linguistics, including its core claims to particular insights …

The truth is in there: Improving reasoning in language models with layer-selective rank reduction

P Sharma, JT Ash, D Misra - arXiv preprint arXiv:2312.13558, 2023 - arxiv.org
Transformer-based Large Language Models (LLMs) have become a fixture in modern
machine learning. Correspondingly, significant resources are allocated towards research …

Grokking of hierarchical structure in vanilla transformers

S Murty, P Sharma, J Andreas, CD Manning - arXiv preprint arXiv …, 2023 - arxiv.org
For humans, language production and comprehension are sensitive to the hierarchical
structure of sentences. In natural language processing, past work has questioned how …

Language models as models of language

R Millière - arXiv preprint arXiv:2408.07144, 2024 - arxiv.org
This chapter critically examines the potential contributions of modern language models to
theoretical linguistics. Despite their focus on engineering goals, these models' ability to …

Learning syntax without planting trees: Understanding when and why transformers generalize hierarchically

K Ahuja, V Balachandran, M Panwar, T He… - arXiv preprint arXiv …, 2024 - arxiv.org
Transformers trained on natural language data have been shown to learn its hierarchical
structure and generalize to sentences with unseen syntactic structures without explicitly …

Pushdown layers: Encoding recursive structure in transformer language models

S Murty, P Sharma, J Andreas, CD Manning - arXiv preprint arXiv …, 2023 - arxiv.org
Recursion is a prominent feature of human language, and fundamentally challenging for
self-attention due to the lack of an explicit recursive-state tracking mechanism. Consequently …

How to plant trees in language models: Data and architectural effects on the emergence of syntactic inductive biases

A Mueller, T Linzen - arXiv preprint arXiv:2305.19905, 2023 - arxiv.org
Accurate syntactic representations are essential for robust generalization in natural
language. Recent work has found that pre-training can teach language models to rely on …

Injecting structural hints: Using language models to study inductive biases in language learning

I Papadimitriou, D Jurafsky - arXiv preprint arXiv:2304.13060, 2023 - arxiv.org
Both humans and large language models are able to learn language without explicit
structural supervision. What inductive biases make this learning possible? We address this …

What does the Knowledge Neuron Thesis Have to do with Knowledge?

J Niu, A Liu, Z Zhu, G Penn - arXiv preprint arXiv:2405.02421, 2024 - arxiv.org
We reassess the Knowledge Neuron (KN) Thesis: an interpretation of the mechanism
underlying the ability of large language models to recall facts from a training corpus. This …

Physics of language models: Part 1, learning hierarchical language structures

Z Allen-Zhu, Y Li - arXiv preprint arXiv:2305.13673, 2023 - arxiv.org
Transformer-based language models are effective but complex, and understanding their
inner workings is a significant challenge. Previous research has primarily explored how …