Semantic structure in deep learning

E Pavlick - Annual Review of Linguistics, 2022 - annualreviews.org
Deep learning has recently come to dominate computational linguistics, leading to claims of
human-level performance in a range of language processing tasks. Like much previous …

Masked language modeling and the distributional hypothesis: Order word matters pre-training for little

K Sinha, R Jia, D Hupkes, J Pineau, A Williams… - arXiv preprint arXiv …, 2021 - arxiv.org
A possible explanation for the impressive performance of masked language model (MLM)
pre-training is that such models have learned to represent the syntactic structures prevalent …
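
A minimal sketch of the masked-prediction objective at issue, assuming the Hugging Face transformers library and the public bert-base-uncased checkpoint (neither is specific to this paper; top_fillers is an illustrative helper). It compares the model's fillers for a [MASK] slot in a natural and a scrambled context, the kind of contrast the paper uses to ask how much word order contributes beyond bag-of-words distributional cues:

    # Masked-token prediction sketch; assumes Hugging Face transformers and
    # the public bert-base-uncased checkpoint (not tied to the paper).
    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
    model.eval()

    def top_fillers(text, k=5):
        """Return the k most probable fillers for the [MASK] slot in text."""
        inputs = tok(text, return_tensors="pt")
        pos = (inputs["input_ids"][0] == tok.mask_token_id).nonzero().item()
        with torch.no_grad():
            logits = model(**inputs).logits[0, pos]
        return tok.convert_ids_to_tokens(logits.topk(k).indices.tolist())

    # Natural vs. scrambled context: distributional cues survive reordering.
    print(top_fillers("The doctor asked the nurse to hand her the [MASK]."))
    print(top_fillers("asked nurse the doctor the hand to her the [MASK]."))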

What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models

A Ettinger - Transactions of the Association for Computational …, 2020 - direct.mit.edu
Pre-training by language modeling has become a popular and successful approach to NLP
tasks, but we have yet to understand exactly what linguistic capacities these pre-training …
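
A cloze-style diagnostic of the kind this suite uses can be read off directly from a masked LM: how much probability does the model put on a specific target word in a constraining context? The sketch below assumes transformers and bert-base-uncased; cloze_prob and the test item are illustrative, not taken from the paper's materials:

    # Cloze-probability sketch; assumes transformers and bert-base-uncased.
    # The context/target pair is illustrative, not from the diagnostic suite.
    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
    model.eval()

    def cloze_prob(context, target):
        """P(target | context) at the [MASK] position."""
        inputs = tok(context, return_tensors="pt")
        pos = (inputs["input_ids"][0] == tok.mask_token_id).nonzero().item()
        with torch.no_grad():
            probs = model(**inputs).logits[0, pos].softmax(-1)
        return probs[tok.convert_tokens_to_ids(target)].item()

    # A high-constraint frame where a human strongly expects one word.
    print(cloze_prob("It was raining, so she opened her [MASK].", "umbrella"))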

Linguistic Knowledge and Transferability of Contextual Representations

NF Liu - arXiv preprint arXiv:1903.08855, 2019 - fq.pkwyx.com
Contextual word representations derived from large-scale neural language models are
successful across a diverse set of NLP tasks, suggesting that they encode useful and …
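
The probing methodology behind results like these is a light classifier trained on frozen contextual vectors. A minimal sketch, assuming transformers plus scikit-learn; the part-of-speech supervision here is a toy stand-in for the full treebank data real probes use, so only the mechanics carry over:

    # Linear-probe sketch on frozen contextual vectors; assumes transformers
    # and scikit-learn. The POS supervision is toy data, not a real treebank.
    import torch
    from transformers import AutoTokenizer, AutoModel
    from sklearn.linear_model import LogisticRegression

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
    model.eval()

    def embed(words, layer=8):
        """Frozen layer-`layer` vector for each word (first sub-token)."""
        enc = tok(words, is_split_into_words=True, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**enc).hidden_states[layer][0]
        first = [enc.word_ids().index(i) for i in range(len(words))]
        return hidden[first].numpy()

    # Probe whether the frozen representation separates a det/noun/verb split.
    sent = ["the", "dog", "chased", "a", "cat"]
    tags = ["DET", "NOUN", "VERB", "DET", "NOUN"]
    probe = LogisticRegression(max_iter=1000).fit(embed(sent), tags)
    print(probe.predict(embed(["the", "bird", "sang"])))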

Interpreting graph neural networks for NLP with differentiable edge masking

MS Schlichtkrull, N De Cao, I Titov - arXiv preprint arXiv:2010.00577, 2020 - arxiv.org
Graph neural networks (GNNs) have become a popular approach to integrating structural
inductive biases into NLP models. However, there has been little work on interpreting them …
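
The underlying idea, a differentiable mask over edges trained so a frozen message-passing model keeps its predictions while the mask is pushed toward sparsity, can be sketched on a toy graph in plain PyTorch. This is an illustrative stand-in, not the paper's implementation; the graph, features, and weights are all synthetic:

    # Toy sketch of differentiable edge masking, not the paper's code: learn
    # a sigmoid mask over edges so a frozen one-layer message-passing model
    # keeps its output while the mask goes sparse. Near-zero entries would
    # mark edges the model does not rely on.
    import torch

    n, d = 4, 8
    x = torch.randn(n, d)                                    # node features
    edges = torch.tensor([[0, 1], [1, 2], [2, 3], [3, 0]])   # directed edges
    W = torch.randn(d, d)                                    # frozen weights

    def forward(edge_weight):
        """One round of weighted message passing with fixed parameters."""
        msgs = torch.zeros(n, d)
        for (s, t), w in zip(edges, edge_weight):
            msgs[t] = msgs[t] + w * (x[s] @ W)
        return msgs

    target = forward(torch.ones(len(edges))).detach()  # original predictions

    logits = torch.zeros(len(edges), requires_grad=True)  # mask parameters
    opt = torch.optim.Adam([logits], lr=0.1)
    for _ in range(200):
        mask = torch.sigmoid(logits)
        # Stay close to the original output while penalizing mask mass.
        loss = ((forward(mask) - target) ** 2).mean() + 0.05 * mask.mean()
        opt.zero_grad()
        loss.backward()
        opt.step()

    print(torch.sigmoid(logits))  # learned edge mask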

BLiMP: The benchmark of linguistic minimal pairs for English

A Warstadt, A Parrish, H Liu, A Mohananey… - Transactions of the …, 2020 - direct.mit.edu
We introduce The Benchmark of Linguistic Minimal Pairs (BLiMP), a challenge set
for evaluating the linguistic knowledge of language models (LMs) on major grammatical …
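
Minimal-pair evaluation of this kind reduces to checking whether an LM assigns higher probability to the grammatical member of each pair. A sketch assuming transformers and the public gpt2 checkpoint; logprob and the agreement pair are illustrative, not items from the benchmark:

    # Minimal-pair scoring sketch; assumes transformers and the public gpt2
    # checkpoint. The sentence pair is illustrative, not a benchmark item.
    import torch
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")
    model.eval()

    def logprob(sentence):
        """Total log-probability the LM assigns to the sentence."""
        ids = tok(sentence, return_tensors="pt").input_ids
        with torch.no_grad():
            out = model(ids, labels=ids)          # loss = mean token NLL
        return -out.loss.item() * (ids.shape[1] - 1)

    good = "The cats on the mat were sleeping."
    bad = "The cats on the mat was sleeping."
    print(logprob(good) > logprob(bad))  # True iff the grammatical form wins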

State-of-the-art generalisation research in NLP: a taxonomy and review

D Hupkes, M Giulianelli, V Dankers, M Artetxe… - arXiv preprint arXiv …, 2022 - arxiv.org
The ability to generalise well is one of the primary desiderata of natural language
processing (NLP). Yet, what 'good generalisation' entails and how it should be evaluated is …

Local interpretations for explainable natural language processing: A survey

S Luo, H Ivison, SC Han, J Poon - ACM Computing Surveys, 2024 - dl.acm.org
As the use of deep learning techniques has grown across various fields over the past
decade, complaints about the opaqueness of the black-box models have increased …
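
As a flavor of the local methods such surveys catalogue, leave-one-out occlusion scores each token by how much deleting it moves the model's confidence. The sketch below assumes transformers and the stock distilbert-base-uncased-finetuned-sst-2-english sentiment checkpoint; occlusion_saliency is an illustrative helper, not taken from the survey:

    # Leave-one-out occlusion sketch; assumes transformers and the public
    # distilbert-base-uncased-finetuned-sst-2-english checkpoint.
    import torch
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    name = "distilbert-base-uncased-finetuned-sst-2-english"
    tok = AutoTokenizer.from_pretrained(name)
    model = AutoModelForSequenceClassification.from_pretrained(name)
    model.eval()

    def p_positive(text):
        """Probability of the POSITIVE class (index 1 in this checkpoint)."""
        with torch.no_grad():
            logits = model(**tok(text, return_tensors="pt")).logits
        return logits.softmax(-1)[0, 1].item()

    def occlusion_saliency(text):
        """Score each word by the confidence drop when it is removed."""
        words = text.split()
        base = p_positive(text)
        return {w: base - p_positive(" ".join(words[:i] + words[i + 1:]))
                for i, w in enumerate(words)}

    print(occlusion_saliency("an unexpectedly charming and heartfelt film"))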

The emergence of number and syntax units in LSTM language models

Y Lakretz, G Kruszewski, T Desbordes… - arXiv preprint arXiv …, 2019 - arxiv.org
Recent work has shown that LSTMs trained on a generic language modeling objective
capture syntax-sensitive generalizations such as long-distance number agreement. We …
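
The unit-level analysis can be pictured as recording LSTM cell states at matched positions for singular versus plural subjects and ranking units by the gap. The snippet below uses an untrained torch.nn.LSTM and a toy vocabulary purely to show the measurement procedure; in the paper, the same kind of probe on a trained LM makes a small number of units separate sharply:

    # Unit-level probe sketch: compare LSTM cell states for singular vs.
    # plural subjects and rank units by the gap. Untrained torch.nn.LSTM and
    # toy vocabulary; only the measurement procedure carries over.
    import torch

    vocab = {"the": 0, "boy": 1, "boys": 2, "near": 3, "car": 4}
    emb = torch.nn.Embedding(len(vocab), 16)
    lstm = torch.nn.LSTM(16, 32, batch_first=True)

    def final_cell(words):
        """Cell state after reading the words, shape (hidden_size,)."""
        ids = torch.tensor([[vocab[w] for w in words]])
        _, (_, c) = lstm(emb(ids))
        return c[0, 0]

    sing = final_cell(["the", "boy", "near", "the", "car"])
    plur = final_cell(["the", "boys", "near", "the", "car"])
    gap = (sing - plur).abs()
    print(gap.topk(3).indices)  # candidate "number units"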

Understanding by understanding not: Modeling negation in language models

A Hosseini, S Reddy, D Bahdanau, RD Hjelm… - arXiv preprint arXiv …, 2021 - arxiv.org
Negation is a core construction in natural language. Despite being very successful on many
tasks, state-of-the-art pre-trained language models often handle negation incorrectly. To …
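
The failure mode in question is straightforward to probe: a masked LM often proposes the same completion whether or not the context is negated. A sketch assuming transformers and bert-base-uncased, with illustrative prompts (best_filler is a hypothetical helper):

    # Negation-sensitivity sketch; assumes transformers and bert-base-uncased.
    # Prompts are illustrative; best_filler is a hypothetical helper.
    import torch
    from transformers import AutoTokenizer, AutoModelForMaskedLM

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
    model.eval()

    def best_filler(text):
        """The single most probable filler for the [MASK] slot."""
        inputs = tok(text, return_tensors="pt")
        pos = (inputs["input_ids"][0] == tok.mask_token_id).nonzero().item()
        with torch.no_grad():
            logits = model(**inputs).logits[0, pos]
        return tok.convert_ids_to_tokens([logits.argmax().item()])[0]

    print(best_filler("A robin is a [MASK]."))      # typically "bird"
    print(best_filler("A robin is not a [MASK]."))  # often still "bird"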