Analysis methods in neural language processing: A survey

Y Belinkov, J Glass - … of the Association for Computational Linguistics, 2019 - direct.mit.edu
The field of natural language processing has seen impressive progress in recent years, with
neural network models replacing many of the traditional systems. A plethora of new models …

Pretraining with artificial language: Studying transferable knowledge in language models

R Ri, Y Tsuruoka - arXiv preprint arXiv:2203.10326, 2022 - arxiv.org
We investigate what kind of structural knowledge learned in neural network encoders is
transferable to processing natural language. We design artificial languages with structural …
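
For a concrete sense of what an artificial language with structural properties can look like, here is a minimal sketch assuming a toy nesting-dependency grammar; the symbol pairs, depth limit, and recursion probability are illustrative choices for this sketch, not details taken from the paper.

import random

# Toy generator for an artificial language with nested (bracket-like)
# dependencies, in the spirit of pretraining corpora built from synthetic
# grammars. The symbol pairs, depth limit, and recursion probability are
# arbitrary choices made for this sketch.

PAIRS = [("a", "A"), ("b", "B"), ("c", "C")]   # matched opener/closer pairs

def generate(max_depth=4, p_recurse=0.6):
    """Recursively generate one nested-dependency string as a token list."""
    if max_depth == 0 or random.random() > p_recurse:
        return []
    opener, closer = random.choice(PAIRS)
    inner = generate(max_depth - 1, p_recurse)
    # Each opener is closed by its matching closer, creating a nested dependency.
    return [opener] + inner + [closer]

if __name__ == "__main__":
    random.seed(0)
    for _ in range(5):
        print(" ".join(generate()) or "<empty>")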

What do end-to-end speech models learn about speaker, language and channel information? A layer-wise and neuron-level analysis

SA Chowdhury, N Durrani, A Ali - Computer Speech & Language, 2024 - Elsevier
Deep neural networks are inherently opaque and challenging to interpret. Unlike hand-
crafted feature-based models, we struggle to comprehend the concepts learned and how …

On evaluating the generalization of LSTM models in formal languages

M Suzgun, Y Belinkov, SM Shieber - arXiv preprint arXiv:1811.01001, 2018 - arxiv.org
Recurrent Neural Networks (RNNs) are theoretically Turing-complete and have established
themselves as a dominant model for language processing. Yet, there still remains an …
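
As a rough illustration of this kind of formal-language evaluation, the following sketch trains a small LSTM as a next-symbol predictor on the counter language a^n b^n and then tests it on longer, unseen lengths; the architecture, hyperparameters, and length splits are assumptions made for illustration, not the authors' setup.

import torch
import torch.nn as nn

# Minimal sketch (not the authors' exact setup): train a small LSTM as a
# next-symbol predictor on the counter language a^n b^n and check whether it
# generalizes to longer strings than seen in training.
# Vocabulary: 0 = 'a', 1 = 'b', 2 = end-of-string marker.

def make_example(n):
    s = [0] * n + [1] * n + [2]                       # a^n b^n followed by EOS
    return torch.tensor(s[:-1]), torch.tensor(s[1:])  # inputs, next-symbol targets

class CharLSTM(nn.Module):
    def __init__(self, vocab=3, hidden=16):
        super().__init__()
        self.emb = nn.Embedding(vocab, hidden)
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, x):
        h, _ = self.lstm(self.emb(x).unsqueeze(0))
        return self.out(h).squeeze(0)

model = CharLSTM()
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Train on short strings only (n = 1..10).
for epoch in range(200):
    for n in range(1, 11):
        x, y = make_example(n)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

# Evaluate on longer, unseen lengths. Inside the a-block the next symbol is
# genuinely ambiguous ('a' or 'b'), so per-token accuracy below 1.0 is
# expected; what matters is whether the b-block and the EOS are predicted
# correctly, i.e. whether the model has learned to count.
with torch.no_grad():
    for n in (5, 20, 50):
        x, y = make_example(n)
        pred = model(x).argmax(dim=-1)
        acc = (pred == y).float().mean().item()
        print(f"n={n}: per-token accuracy = {acc:.2f}")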

LSTMs compose (and learn) bottom-up

N Saphra, A Lopez - arXiv preprint arXiv:2010.04650, 2020 - arxiv.org
Recent work in NLP shows that LSTM language models capture hierarchical structure in
language data. In contrast to existing work, we consider the learning process that …

Diversity as a by-product: Goal-oriented language generation leads to linguistic variation

S Schüz, T Han, S Zarrieß - … of the 22nd Annual Meeting of the …, 2021 - aclanthology.org
The ability to vary language use is necessary for speakers to achieve their
conversational goals, for instance when referring to objects in visual environments. We …

Automatically Extracting Challenge Sets for Non-local Phenomena in Neural Machine Translation

L Choshen, O Abend - arXiv preprint arXiv:1909.06814, 2019 - arxiv.org
We show that the state-of-the-art Transformer Machine Translation (MT) model is not biased
towards monotonic reordering (unlike previous recurrent neural network models), but that …

Analogical inference from distributional structure: What recurrent neural networks can tell us about word learning

PA Huebner, JA Willits - Machine Learning with Applications, 2023 - Elsevier
One proposal that can explain the remarkable pace of word learning in young children is
that they leverage the language-internal distributional similarity of familiar and novel words …
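
A generic illustration of what language-internal distributional similarity means, assuming simple window-based co-occurrence counts and cosine similarity; the toy corpus and the novel word "dax" are invented here, and the paper itself approaches the question with recurrent neural network language models.

import numpy as np
from collections import defaultdict

# Generic illustration of "language-internal distributional similarity":
# build window-based co-occurrence vectors from a toy corpus and compare a
# novel word to familiar ones by cosine similarity. The corpus and the novel
# word "dax" are invented for this sketch.

corpus = [
    "the dog chased the ball",
    "the cat chased the ball",
    "the dax chased the ball",   # "dax" appears in a familiar noun context
    "the dog ate the food",
    "the cat ate the food",
]

# Count co-occurrences within a +/-2 word window.
vocab = sorted({w for sent in corpus for w in sent.split()})
index = {w: i for i, w in enumerate(vocab)}
counts = defaultdict(lambda: np.zeros(len(vocab)))
for sent in corpus:
    words = sent.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - 2), min(len(words), i + 3)):
            if j != i:
                counts[w][index[words[j]]] += 1

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# A novel word that occurs in the same contexts as familiar nouns ends up
# distributionally closer to those nouns than to the verb.
for w in ("dog", "cat", "ball", "chased"):
    print(f"sim(dax, {w}) = {cosine(counts['dax'], counts[w]):.2f}")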

Language models learn POS first

N Saphra, A Lopez - … and Interpreting Neural Networks for NLP, 2018 - research.ed.ac.uk
A glut of recent research shows that language models capture linguistic structure. Linzen et
al. (2016) found that LSTM-based language models may encode syntactic information …
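
For context, one common way to test whether hidden states encode part-of-speech information is a diagnostic probe; the sketch below shows only that generic recipe (a linear probe over LSTM states), not the authors' specific analysis, and its untrained LSTM and hand-tagged toy corpus are stand-ins.

import numpy as np
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

# Generic diagnostic-probing sketch, not the authors' specific analysis:
# extract per-token hidden states from an LSTM and fit a linear probe to
# predict POS tags from them. The LSTM here is untrained and the corpus is a
# tiny hand-tagged toy; in a real study the states would come from a trained
# language model and the tags from an annotated corpus.

sentences = [
    ("the dog runs", ["DET", "NOUN", "VERB"]),
    ("a cat sleeps", ["DET", "NOUN", "VERB"]),
    ("the bird sings", ["DET", "NOUN", "VERB"]),
]
vocab = {w: i for i, w in enumerate(sorted({w for s, _ in sentences for w in s.split()}))}
tagset = {t: i for i, t in enumerate(sorted({t for _, tags in sentences for t in tags}))}

emb = nn.Embedding(len(vocab), 32)
lstm = nn.LSTM(32, 32, batch_first=True)

features, labels = [], []
with torch.no_grad():
    for sent, tags in sentences:
        ids = torch.tensor([[vocab[w] for w in sent.split()]])
        hidden, _ = lstm(emb(ids))                    # shape (1, seq_len, 32)
        features.append(hidden.squeeze(0).numpy())
        labels.extend(tagset[t] for t in tags)

X, y = np.concatenate(features), np.array(labels)

# If POS is linearly decodable from the hidden states, the probe's accuracy
# beats the majority-class baseline (here both are computed on the same toy
# data, so this is only a shape-level illustration of the recipe).
probe = LogisticRegression(max_iter=1000).fit(X, y)
print("probe accuracy:    ", probe.score(X, y))
print("majority baseline: ", np.bincount(y).max() / len(y))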

How LSTM encodes syntax: Exploring context vectors and semi-quantization on natural text

C Shibata, K Uchiumi, D Mochihashi - arXiv preprint arXiv:2010.00363, 2020 - arxiv.org
The Long Short-Term Memory recurrent neural network (LSTM) is widely used and known to
capture informative long-term syntactic dependencies. However, how such information is …
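
One way questions about context vectors can be made concrete is to measure how close their activations come to discrete values; the sketch below shows such a measurement on an untrained LSTM with random inputs, purely as a self-contained stand-in rather than the paper's method.

import torch
import torch.nn as nn

# Illustrative sketch only: one way to look for "semi-quantization" in LSTM
# context vectors is to measure how often activations sit near discrete
# values (here, within 0.1 of -1, 0, or +1). The paper analyzes trained
# models on natural text; an untrained LSTM over random inputs is used below
# purely as a self-contained stand-in.

torch.manual_seed(0)
lstm = nn.LSTM(input_size=8, hidden_size=64, batch_first=True)
x = torch.randn(1, 200, 8)               # stand-in for an embedded token sequence

with torch.no_grad():
    hidden_states, (h_n, c_n) = lstm(x)  # hidden states lie in (-1, 1)
    squashed_cell = torch.tanh(c_n)      # squash the final cell state into (-1, 1)
    acts = torch.cat([hidden_states.flatten(), squashed_cell.flatten()])

near_discrete = ((acts - acts.round()).abs() < 0.1).float().mean().item()
print(f"fraction of activations within 0.1 of a discrete value: {near_discrete:.2f}")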