FlauBERT: Unsupervised language model pre-training for French

H Le, L Vial, J Frej, V Segonne, M Coavoux… - arXiv preprint arXiv …, 2019 - arxiv.org
Language models have become a key step to achieve state-of-the-art results in many
different Natural Language Processing (NLP) tasks. Leveraging the huge amount of …

Constituency parsing with a self-attentive encoder

N Kitaev, D Klein - arXiv preprint arXiv:1805.01052, 2018 - arxiv.org
We demonstrate that replacing an LSTM encoder with a self-attentive architecture can lead
to improvements to a state-of-the-art discriminative constituency parser. The use of attention …

Multilingual constituency parsing with self-attention and pre-training

N Kitaev, S Cao, D Klein - arXiv preprint arXiv:1812.11760, 2018 - arxiv.org
We show that constituency parsing benefits from unsupervised pre-training across a variety
of languages and a range of pre-training conditions. We first compare the benefits of no pre …

Multiword expression processing: A survey

M Constant, G Eryiğit, J Monti, L Van Der Plas… - Computational …, 2017 - direct.mit.edu
Multiword expressions (MWEs) are a class of linguistic forms spanning conventional word
boundaries that are both idiosyncratic and pervasive across different languages. The …

Improved transition-based parsing by modeling characters instead of words with LSTMs

M Ballesteros, C Dyer, NA Smith - arXiv preprint arXiv:1508.00657, 2015 - arxiv.org
We present extensions to a continuous-state dependency parsing method that makes it
applicable to morphologically rich languages. Starting with a high-performance transition …

It depends: Dependency parser comparison using a web-based evaluation tool

JD Choi, J Tetreault, A Stent - … of the 53rd Annual Meeting of the …, 2015 - aclanthology.org
The last few years have seen a surge in the number of accurate, fast, publicly available
dependency parsers. At the same time, the use of dependency parsing in NLP applications …

Joint lemmatization and morphological tagging with Lemming

T Muller, R Cotterell, A Fraser, H Schütze - arXiv preprint arXiv …, 2024 - arxiv.org
We present LEMMING, a modular log-linear model that jointly models lemmatization and
tagging and supports the integration of arbitrary global features. It is trainable on corpora …

Machine learning and Hebrew NLP for automated assessment of open-ended questions in biology

M Ariely, T Nazaretsky, G Alexandron - International Journal of Artificial …, 2023 - Springer
Machine learning algorithms that automatically score scientific explanations can be
used to measure students' conceptual understanding, identify gaps in their reasoning, and …

AlephBERT: Language model pre-training and evaluation from sub-word to sentence level

A Seker, E Bandel, D Bareket… - Proceedings of the …, 2022 - aclanthology.org
Large Pre-trained Language Models (PLMs) have become ubiquitous in the
development of language understanding technology and lie at the heart of many artificial …

Neural CRF parsing

G Durrett, D Klein - arXiv preprint arXiv:1507.03641, 2015 - arxiv.org
This paper describes a parsing model that combines the exact dynamic programming of
CRF parsing with the rich nonlinear featurization of neural net approaches. Our model is …