Recent advances in recurrent neural networks

H Salehinejad, S Sankar, J Barfett, E Colak… - arXiv preprint arXiv …, 2017 - arxiv.org
Recurrent neural networks (RNNs) are capable of learning features and long-term
dependencies from sequential and time-series data. RNNs have a stack of non-linear …

A primer on neural network models for natural language processing

Y Goldberg - Journal of Artificial Intelligence Research, 2016 - jair.org
Over the past few years, neural networks have re-emerged as powerful machine-learning
models, yielding state-of-the-art results in fields such as image recognition and speech …

[BOOK][B] Neural network methods in natural language processing

Y Goldberg - 2017 - books.google.com
Neural networks are a family of powerful machine learning models, and this book focuses on
their application to natural language data. The first half of the book (Parts I and II) covers the …

Enriching word vectors with subword information

P Bojanowski, E Grave, A Joulin… - Transactions of the …, 2017 - direct.mit.edu
Continuous word representations, trained on large unlabeled corpora, are useful for many
natural language processing tasks. Popular models that learn such representations ignore …
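
This entry is the fastText paper: as I understand it, each word is represented as a bag of character n-grams (plus the word itself), and the word vector is the sum of the n-gram vectors, so unseen words still get usable representations from shared n-grams. Below is a minimal untrained sketch of that composition; the bucket count, dimensions, and use of Python's built-in hash are illustrative stand-ins (fastText itself hashes n-grams into a much larger table).

    import numpy as np

    def char_ngrams(word, n_min=3, n_max=6):
        """All character n-grams of the word padded with boundary markers,
        plus the padded word itself, as in fastText."""
        padded = f"<{word}>"
        grams = [padded[i:i + n]
                 for n in range(n_min, n_max + 1)
                 for i in range(len(padded) - n + 1)]
        grams.append(padded)
        return grams

    # Toy setup: hash n-grams into a small number of buckets (illustrative).
    N_BUCKETS, DIM = 1000, 50
    rng = np.random.default_rng(0)
    ngram_vectors = rng.normal(scale=0.1, size=(N_BUCKETS, DIM))

    def word_vector(word):
        """Word representation = sum of its n-gram vectors (untrained here)."""
        idx = [hash(g) % N_BUCKETS for g in char_ngrams(word)]
        return ngram_vectors[idx].sum(axis=0)

    # Out-of-vocabulary words still get a vector via shared n-grams.
    print(word_vector("unhappiness").shape)  # (50,)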

Character-aware neural language models

Y Kim, Y Jernite, D Sontag, A Rush - … of the AAAI Conference on Artificial …, 2016 - ojs.aaai.org
We describe a simple neural language model that relies only on character-level inputs.
Predictions are still made at the word level. Our model employs a convolutional neural …
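
The snippet cuts off mid-description; the model, as described in the paper, embeds each character, runs convolutional filters of several widths over the character embeddings, and max-pools over time to get a word representation (which then passes through a highway network into a word-level LSTM). A minimal numpy sketch of just the character-CNN encoder, with toy filter widths and counts:

    import numpy as np

    rng = np.random.default_rng(0)
    CHAR_DIM = 15                    # character embedding size
    FILTERS = {2: 20, 3: 30, 4: 40}  # filter width -> number of filters (toy)
    char_emb = {c: rng.normal(size=CHAR_DIM)
                for c in "abcdefghijklmnopqrstuvwxyz"}
    conv_w = {w: rng.normal(scale=0.1, size=(n, w * CHAR_DIM))
              for w, n in FILTERS.items()}

    def word_encoding(word):
        """Character CNN: convolve filters of several widths over the
        character embeddings, then max-pool over time and concatenate."""
        C = np.stack([char_emb[c] for c in word])   # (len(word), CHAR_DIM)
        feats = []
        for w, W in conv_w.items():
            windows = np.stack([C[i:i + w].ravel()  # all width-w windows
                                for i in range(len(word) - w + 1)])
            feats.append(np.tanh(windows @ W.T).max(axis=0))  # max over time
        return np.concatenate(feats)  # the paper feeds this to a highway layer

    print(word_encoding("absurdity").shape)  # (90,) = 20 + 30 + 40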

75 languages, 1 model: Parsing universal dependencies universally

D Kondratyuk, M Straka - arXiv preprint arXiv:1904.02099, 2019 - arxiv.org
We present UDify, a multilingual multi-task model capable of accurately predicting universal
part-of-speech, morphological features, lemmas, and dependency trees simultaneously for …

Globally normalized transition-based neural networks

D Andor, C Alberti, D Weiss, A Severyn… - arXiv preprint arXiv …, 2016 - arxiv.org
We introduce a globally normalized transition-based neural network model that achieves
state-of-the-art part-of-speech tagging, dependency parsing and sentence compression …

Finding function in form: Compositional character models for open vocabulary word representation

W Ling, T Luís, L Marujo, RF Astudillo, S Amir… - arXiv preprint arXiv …, 2015 - arxiv.org
We introduce a model for constructing vector representations of words by composing
characters using bidirectional LSTMs. Relative to traditional word representation models that …
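
Here the composition is sequential rather than convolutional: a forward and a backward LSTM read the word's character embeddings, and their final states are combined into the word vector. The sketch below simply concatenates the two final states (the paper combines them with learned projections); the tiny LSTM and all dimensions are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    CHAR_DIM, HID = 10, 16
    emb = {c: rng.normal(size=CHAR_DIM)
           for c in "abcdefghijklmnopqrstuvwxyz"}

    def make_lstm():
        # One stacked weight matrix for the four gates (i, f, o, g).
        return (rng.normal(scale=0.1, size=(4 * HID, CHAR_DIM + HID)),
                np.zeros(4 * HID))

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def run_lstm(params, xs):
        """Run an LSTM over a sequence; return the final hidden state."""
        W, b = params
        h, c = np.zeros(HID), np.zeros(HID)
        for x in xs:
            z = W @ np.concatenate([x, h]) + b
            i, f, o, g = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
        return h

    fwd, bwd = make_lstm(), make_lstm()

    def char_to_word(word):
        """Word vector from characters: final states of a forward and a
        backward LSTM over the character embeddings, concatenated."""
        xs = [emb[c] for c in word]
        return np.concatenate([run_lstm(fwd, xs), run_lstm(bwd, xs[::-1])])

    print(char_to_word("compositional").shape)  # (32,)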

Multilingual part-of-speech tagging with bidirectional long short-term memory models and auxiliary loss

B Plank, A Søgaard, Y Goldberg - arXiv preprint arXiv:1604.05529, 2016 - arxiv.org
Bidirectional long short-term memory (bi-LSTM) networks have recently proven successful
for various NLP sequence modeling tasks, but little is known about their reliance on input …
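
The auxiliary loss in the title is, if I recall the paper correctly, predicting each word's (binned) log frequency jointly with its POS tag, so the shared bi-LSTM state feeds two softmax heads and training minimizes the sum of the two cross-entropies. A minimal sketch of that joint objective; the state size and number of frequency bins are hypothetical (17 is the Universal Dependencies POS tag count):

    import numpy as np

    def cross_entropy(logits, target):
        """Softmax cross-entropy for a single prediction."""
        logits = logits - logits.max()  # numerical stability
        log_probs = logits - np.log(np.exp(logits).sum())
        return -log_probs[target]

    def joint_loss(h, W_pos, W_freq, pos_tag, freq_bin):
        """Shared bi-LSTM state h feeds two heads; the training signal is
        the POS loss plus the auxiliary log-frequency loss."""
        return (cross_entropy(W_pos @ h, pos_tag) +
                cross_entropy(W_freq @ h, freq_bin))

    rng = np.random.default_rng(0)
    h = rng.normal(size=32)                        # bi-LSTM state, one token
    W_pos = rng.normal(scale=0.1, size=(17, 32))   # 17 UD POS tags
    W_freq = rng.normal(scale=0.1, size=(10, 32))  # 10 hypothetical bins
    print(joint_loss(h, W_pos, W_freq, pos_tag=3, freq_bin=7))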

Achieving open vocabulary neural machine translation with hybrid word-character models

MT Luong, CD Manning - arXiv preprint arXiv:1604.00788, 2016 - arxiv.org
Nearly all previous work on neural machine translation (NMT) has used quite restricted
vocabularies, perhaps with a subsequent method to patch in unknown words. This paper …