Understanding the societal impacts of machine translation: a critical review of the literature on medical and legal use cases
The ready availability of machine translation (MT) systems such as Google Translate has
profoundly changed how society engages with multilingual communication practices. In …
The Eval4NLP shared task on explainable quality estimation: Overview and results
In this paper, we introduce the Eval4NLP-2021 shared task on explainable quality
estimation. Given a source-translation pair, this shared task requires not only to provide a …
CometKiwi: IST-Unbabel 2022 submission for the quality estimation shared task
We present the joint contribution of IST and Unbabel to the WMT 2022 Shared Task on
Quality Estimation (QE). Our team participated on all three subtasks: (i) Sentence and Word …
Data-driven sentence simplification: Survey and benchmark
Sentence Simplification (SS) aims to modify a sentence in order to make it easier to read
and understand. In order to do so, several rewriting transformations can be performed such …
OpenKiwi: An open source framework for quality estimation
We introduce OpenKiwi, a PyTorch-based open source framework for translation quality
estimation. OpenKiwi supports training and testing of word-level and sentence-level quality …
Machine translation decoding beyond beam search
Beam search is the go-to method for decoding auto-regressive machine translation models.
While it yields consistent improvements in terms of BLEU, it is only concerned with finding …
InfoLM: A new metric to evaluate summarization & data2text generation
Assessing the quality of natural language generation (NLG) systems through human
annotation is very expensive. Additionally, human annotation campaigns are time …
MLQE-PE: A multilingual quality estimation and post-editing dataset
We present MLQE-PE, a new dataset for Machine Translation (MT) Quality Estimation (QE)
and Automatic Post-Editing (APE). The dataset contains eleven language pairs, with human …
Towards explainable evaluation metrics for machine translation
Unlike classical lexical overlap metrics such as BLEU, most current evaluation metrics for
machine translation (for example, COMET or BERTScore) are based on black-box large …
Denoising pre-training for machine translation quality estimation with curriculum learning
Quality estimation (QE) aims to assess the quality of machine translations when reference
translations are unavailable. QE plays a crucial role in many real-world applications of …