Probing classifiers: Promises, shortcomings, and advances
Y Belinkov - Computational Linguistics, 2022 - direct.mit.edu
Probing classifiers have emerged as one of the prominent methodologies for interpreting
and analyzing deep neural network models of natural language processing. The basic idea …
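[The snippet above is cut off at "the basic idea". As background, the basic idea of a probing classifier is to train a small supervised model on a frozen network's representations to predict a linguistic property; held-out accuracy is read as evidence of how decodable that property is. A minimal sketch in Python, with random vectors standing in for real model representations and an invented binary property label; all names here are illustrative, not from the source:

    # Minimal probing-classifier sketch. Assumptions (not from the source):
    # representations would come from some frozen encoder; here random
    # vectors stand in for them, and `labels` stands in for a binary
    # linguistic property (e.g., singular vs. plural) annotated per example.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(0)

    # Stand-in for hidden states extracted from a frozen model:
    # 1000 examples, 768-dimensional representations.
    representations = rng.normal(size=(1000, 768))
    labels = rng.integers(0, 2, size=1000)

    X_train, X_test, y_train, y_test = train_test_split(
        representations, labels, test_size=0.2, random_state=0
    )

    # The probe: a simple classifier trained on the frozen representations.
    probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    # High held-out accuracy is taken as evidence that the property is
    # (linearly) decodable from the representations; with the random
    # stand-in data above it will hover near chance (~0.5).
    print("probe accuracy:", accuracy_score(y_test, probe.predict(X_test)))
]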
Masked language modeling and the distributional hypothesis: Order word matters pre-training for little
A possible explanation for the impressive performance of masked language model (MLM)
pre-training is that such models have learned to represent the syntactic structures prevalent …
Implicit representations of meaning in neural language models
Does the effectiveness of neural language models derive entirely from accurate modeling of
surface word co-occurrence statistics, or do these models represent and reason about the …
When do you need billions of words of pretraining data?
NLP is currently dominated by general-purpose pretrained language models like RoBERTa,
which achieve strong performance on NLU tasks through pretraining on billions of words …
Schrödinger's tree—On syntax and neural language models
In the last half-decade, the field of natural language processing (NLP) has undergone two
major transitions: the switch to neural networks as the primary modeling paradigm and the …
Can language models encode perceptual structure without grounding? a case study in color
Pretrained language models have been shown to encode relational information, such as the
relations between entities or concepts in knowledge bases, e.g., (Paris, Capital, France) …
Sudden drops in the loss: Syntax acquisition, phase transitions, and simplicity bias in MLMs
Most interpretability research in NLP focuses on understanding the behavior and features of
a fully trained model. However, certain insights into model behavior may only be accessible …
Word order does matter and shuffled language models know it
Recent studies have shown that language models pretrained and/or fine-tuned on randomly
permuted sentences exhibit competitive performance on GLUE, putting into question the …
Probing for the usage of grammatical number
A central quest of probing is to uncover how pre-trained models encode a linguistic property
within their representations. An encoding, however, might be spurious, i.e., the model might …
Language models as models of language
R Millière - arXiv preprint arXiv:2408.07144, 2024 - arxiv.org
This chapter critically examines the potential contributions of modern language models to
theoretical linguistics. Despite their focus on engineering goals, these models' ability to …