Modern language models refute Chomsky's approach to language
ST Piantadosi - From fieldwork to linguistic theory: A tribute to …, 2023 - books.google.com
Modern machine learning has subverted and bypassed the theoretical framework of
Chomsky's generative approach to linguistics, including its core claims to particular insights …
The truth is in there: Improving reasoning in language models with layer-selective rank reduction
Transformer-based Large Language Models (LLMs) have become a fixture in modern
machine learning. Correspondingly, significant resources are allocated towards research …
Grokking of hierarchical structure in vanilla transformers
For humans, language production and comprehension is sensitive to the hierarchical
structure of sentences. In natural language processing, past work has questioned how …
Language models as models of language
R Millière - arXiv preprint arXiv:2408.07144, 2024 - arxiv.org
This chapter critically examines the potential contributions of modern language models to
theoretical linguistics. Despite their focus on engineering goals, these models' ability to …
Learning syntax without planting trees: Understanding when and why transformers generalize hierarchically
Transformers trained on natural language data have been shown to learn its hierarchical
structure and generalize to sentences with unseen syntactic structures without explicitly …
Pushdown layers: Encoding recursive structure in transformer language models
Recursion is a prominent feature of human language, and fundamentally challenging for self-
attention due to the lack of an explicit recursive-state tracking mechanism. Consequently …
How to plant trees in language models: Data and architectural effects on the emergence of syntactic inductive biases
Accurate syntactic representations are essential for robust generalization in natural
language. Recent work has found that pre-training can teach language models to rely on …
Injecting structural hints: Using language models to study inductive biases in language learning
Both humans and large language models are able to learn language without explicit
structural supervision. What inductive biases make this learning possible? We address this …
What does the Knowledge Neuron Thesis Have to do with Knowledge?
We reassess the Knowledge Neuron (KN) Thesis: an interpretation of the mechanism
underlying the ability of large language models to recall facts from a training corpus. This …
Physics of language models: Part 1, learning hierarchical language structures
Z Allen-Zhu, Y Li - arXiv preprint arXiv:2305.13673, 2023 - arxiv.org
Transformer-based language models are effective but complex, and understanding their
inner workings is a significant challenge. Previous research has primarily explored how …