Analysis methods in neural language processing: A survey
The field of natural language processing has seen impressive progress in recent years, with
neural network models replacing many of the traditional systems. A plethora of new models …
Theoretical limitations of self-attention in neural sequence models
M Hahn - Transactions of the Association for Computational …, 2020 - direct.mit.edu
Transformers are emerging as the new workhorse of NLP, showing great success across
tasks. Unlike LSTMs, transformers process input sequences entirely through self-attention …
Thinking like transformers
What is the computational model behind a Transformer? Where recurrent neural networks
have direct parallels in finite state machines, allowing clear discussion and thought around …
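To make the RNN / finite-state-machine parallel the snippet alludes to concrete, here is a minimal sketch (not from the paper; the even-number-of-'a's automaton is a hypothetical example) of an RNN-style update whose one-hot hidden state steps through a deterministic finite automaton:

```python
# Minimal sketch: an RNN-style update that simulates a DFA over {a, b}
# accepting strings with an even number of 'a's. The hidden state is a
# one-hot vector over DFA states, and each input symbol selects a 0/1
# transition matrix, so h_t = T[x_t] @ h_{t-1} is exactly one DFA step.
import numpy as np

# Transition matrices: column j holds the one-hot successor of state j.
T = {
    "a": np.array([[0, 1],          # 'a' toggles between state 0 (even) and state 1 (odd)
                   [1, 0]]),
    "b": np.eye(2, dtype=int),      # 'b' leaves the state unchanged
}

def accepts(string: str) -> bool:
    h = np.array([1, 0])            # start in state 0 ("even number of a's so far")
    for symbol in string:
        h = T[symbol] @ h           # one RNN step = one DFA transition
    return bool(h[0])               # accept iff we end in state 0

print(accepts("abba"))  # True  (two a's)
print(accepts("ab"))    # False (one a)
```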
Self-attention networks can process bounded hierarchical languages
Despite their impressive performance in NLP, self-attention networks were recently proved
to be limited for processing formal languages with hierarchical structure, such as $\mathsf …
Do neural models learn systematicity of monotonicity inference in natural language?
Despite the success of language models using neural networks, it remains unclear to what
extent neural models have the generalization ability to perform inferences. In this paper, we …
How can self-attention networks recognize Dyck-n languages?
We focus on the recognition of Dyck-n ($\mathcal{D}_n$) languages with self-attention
(SA) networks, which has been deemed to be a difficult task for these networks. We compare …
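As a point of reference for what the recognition task involves (an illustration of the language, not of the paper's self-attention model), a minimal stack-based Dyck-n checker might look like the following; the three bracket pairs are a hypothetical choice for the example:

```python
# Minimal sketch: a stack-based recognizer for Dyck-n, the language of
# well-nested strings over n distinct bracket pairs.
def is_dyck(string: str, pairs=(("(", ")"), ("[", "]"), ("{", "}"))) -> bool:
    openers = {o: c for o, c in pairs}
    closers = {c for _, c in pairs}
    stack = []
    for ch in string:
        if ch in openers:
            stack.append(openers[ch])           # push the matching closer
        elif ch in closers:
            if not stack or stack.pop() != ch:  # must match the most recent opener
                return False
        else:
            return False                        # symbol outside the alphabet
    return not stack                            # every opener must be closed

print(is_dyck("([]{})"))  # True
print(is_dyck("([)]"))    # False: crossing brackets are not well-nested
```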
Evaluating the ability of LSTMs to learn context-free grammars
L Sennhauser, RC Berwick - arXiv preprint arXiv:1811.02611, 2018 - arxiv.org
While long short-term memory (LSTM) neural net architectures are designed to capture
sequence information, human language is generally composed of hierarchical structures …
Memory-augmented recurrent neural networks can learn generalized Dyck languages
We introduce three memory-augmented Recurrent Neural Networks (MARNNs) and explore
their capabilities on a series of simple language modeling tasks whose solutions require …
Formal and empirical studies of counting behaviour in ReLU RNNs
In recent years, the discussion about systematicity of neural network learning has gained
renewed interest, in particular the formal analysis of neural network behaviour. In this paper …
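As a hedged illustration of the counting mechanism studied in this line of work (an assumed hand-set construction, not the paper's own), a single ReLU unit can serve as a counter that is incremented by 'a' and decremented by 'b', which together with two running checks recognizes strings of the form a^n b^n:

```python
# Minimal sketch: one ReLU unit used as a counter for a^n b^n.
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def recognizes_anbn(string: str) -> bool:
    count = 0.0
    seen_b = False
    for ch in string:
        if ch == "a":
            if seen_b:                 # an 'a' after a 'b' breaks the a^n b^n shape
                return False
            count = relu(count + 1.0)  # increment the counter
        elif ch == "b":
            seen_b = True
            if count <= 0.0:           # more b's than a's seen so far
                return False
            count = relu(count - 1.0)  # decrement the counter
        else:
            return False
    return count == 0.0                # accept iff the counter returns to zero

print(recognizes_anbn("aaabbb"))  # True
print(recognizes_anbn("aabbb"))   # False
```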
Learning the Dyck language with attention-based Seq2Seq models
The generalized Dyck language has been used to analyze the ability of Recurrent Neural
Networks (RNNs) to learn context-free grammars (CFGs). Recent studies draw conflicting …