Data-driven sentence simplification: Survey and benchmark

F Alva-Manchego, C Scarton, L Specia - Computational Linguistics, 2020 - direct.mit.edu
Sentence Simplification (SS) aims to modify a sentence in order to make it easier to read
and understand. In order to do so, several rewriting transformations can be performed such …

Corpora generation for grammatical error correction

J Lichtarge, C Alberti, S Kumar, N Shazeer… - arXiv preprint arXiv …, 2019 - arxiv.org
Grammatical Error Correction (GEC) has been recently modeled using the sequence-to-sequence
framework. However, unlike sequence transduction problems such as machine …

MUSS: Multilingual unsupervised sentence simplification by mining paraphrases

L Martin, A Fan, E De La Clergerie, A Bordes… - arXiv preprint arXiv …, 2020 - arxiv.org
Progress in sentence simplification has been hindered by a lack of labeled parallel
simplification data, particularly in languages other than English. We introduce MUSS, a …

Revisiting non-English text simplification: A unified multilingual benchmark

MJ Ryan, T Naous, W Xu - arXiv preprint arXiv:2305.15678, 2023 - arxiv.org
Recent advancements in high-quality, large-scale English resources have pushed the
frontier of English Automatic Text Simplification (ATS) research. However, less work has …

Learning to split and rephrase from Wikipedia edit history

JA Botha, M Faruqui, J Alex, J Baldridge… - arXiv preprint arXiv …, 2018 - arxiv.org
Split and rephrase is the task of breaking down a sentence into shorter ones that together
convey the same meaning. We extract a rich new dataset for this task by mining Wikipedia's …

Multilingual unsupervised sentence simplification

L Martin, A Fan, EV de La Clergerie, A Bordes, B Sagot - 2021 - inria.hal.science
Progress in Sentence Simplification has been hindered by the lack of supervised data,
particularly in languages other than English. Previous work has aligned sentences from …

LexFit: Lexical fine-tuning of pretrained language models

I Vulić, EM Ponti, A Korhonen… - Proceedings of the 59th …, 2021 - aclanthology.org
Transformer-based language models (LMs) pretrained on large text collections implicitly
store a wealth of lexical semantic knowledge, but it is non-trivial to extract that knowledge …

Neural readability pairwise ranking for sentences in Italian administrative language

M Miliani, S Auriemma… - Proceedings of the …, 2022 - aclanthology.org
Automatic Readability Assessment aims at assigning a complexity level to a given
text, which could help improve the accessibility to information in specific domains, such as …

GEMv2: Multilingual NLG benchmarking in a single line of code

S Gehrmann, A Bhattacharjee, A Mahendiran… - arXiv preprint arXiv …, 2022 - arxiv.org
Evaluation in machine learning is usually informed by past choices, for example which
datasets or metrics to use. This standardization enables the comparison on equal footing …