Analysis methods in neural language processing: A survey

Y Belinkov, J Glass - … of the Association for Computational Linguistics, 2019 - direct.mit.edu
The field of natural language processing has seen impressive progress in recent years, with
neural network models replacing many of the traditional systems. A plethora of new models …

Paradigm shift in natural language processing

TX Sun, XY Liu, XP Qiu, XJ Huang - Machine Intelligence Research, 2022 - Springer
In the era of deep learning, modeling for most natural language processing (NLP) tasks has
converged into several mainstream paradigms. For example, we usually adopt the …

Multitask prompted training enables zero-shot task generalization

V Sanh, A Webson, C Raffel, SH Bach… - arXiv preprint arXiv …, 2021 - arxiv.org
Large language models have recently been shown to attain reasonable zero-shot
generalization on a diverse set of tasks (Brown et al., 2020). It has been hypothesized that …

Transformers: State-of-the-art natural language processing

T Wolf, L Debut, V Sanh, J Chaumond… - Proceedings of the …, 2020 - aclanthology.org
Recent progress in natural language processing has been driven by advances in both
model architecture and model pretraining. Transformer architectures have facilitated …

RoBERTa: A robustly optimized BERT pretraining approach

Y Liu, M Ott, N Goyal, J Du, M Joshi, D Chen… - arXiv preprint arXiv …, 2019 - arxiv.org
Language model pretraining has led to significant performance gains but careful
comparison between different approaches is challenging. Training is computationally …

BERT rediscovers the classical NLP pipeline

I Tenney, D Das, E Pavlick - arXiv preprint arXiv:1905.05950, 2019 - arxiv.org
Pre-trained text encoders have rapidly advanced the state of the art on many NLP tasks. We
focus on one such model, BERT, and aim to quantify where linguistic information is captured …

BoolQ: Exploring the surprising difficulty of natural yes/no questions

C Clark, K Lee, MW Chang, T Kwiatkowski… - arXiv preprint arXiv …, 2019 - arxiv.org
In this paper we study yes/no questions that are naturally occurring---meaning that they are
generated in unprompted and unconstrained settings. We build a reading comprehension …

SuperGLUE: A stickier benchmark for general-purpose language understanding systems

A Wang, Y Pruksachatkun, N Nangia… - Advances in neural …, 2019 - proceedings.neurips.cc
In the last year, new models and methods for pretraining and transfer learning have driven
striking performance improvements across a range of language understanding tasks. The …

What BERT is not: Lessons from a new suite of psycholinguistic diagnostics for language models

A Ettinger - Transactions of the Association for Computational …, 2020 - direct.mit.edu
Pre-training by language modeling has become a popular and successful approach to NLP
tasks, but we have yet to understand exactly what linguistic capacities these pre-training …

Right for the wrong reasons: Diagnosing syntactic heuristics in natural language inference

RT McCoy, E Pavlick, T Linzen - arXiv preprint arXiv:1902.01007, 2019 - arxiv.org
A machine learning system can score well on a given test set by relying on heuristics that
are effective for frequent example types but break down in more challenging cases. We …