How to evaluate machine translation: A review of automated and human metrics
E Chatzikoumi - Natural Language Engineering, 2020 - cambridge.org
This article presents the most up-to-date, influential automated, semiautomated and human
metrics used to evaluate the quality of machine translation (MT) output and provides the …
A comprehensive survey on various fully automatic machine translation evaluation metrics
The fast advancement in machine translation models necessitates the development of
accurate evaluation metrics that would allow researchers to track the progress in text …
Why we need new evaluation metrics for NLG
The majority of NLG evaluation relies on automatic metrics, such as BLEU. In this paper, we
motivate the need for novel, system- and data-independent automatic evaluation methods …
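Several of the entries above center on BLEU, the de facto baseline metric these papers critique. For context, a minimal sketch of sentence-level BLEU (clipped n-gram precision combined by a geometric mean, with a brevity penalty) might look like the following; the function name and toy sentences are illustrative, not drawn from any cited paper, and real implementations add smoothing for zero counts:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # All contiguous n-grams of a token list.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def sentence_bleu(reference, hypothesis, max_n=4):
    """Toy sentence-level BLEU: clipped n-gram precisions, geometric mean,
    brevity penalty. Illustrative only; production metrics use smoothing."""
    precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hypothesis, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clip each hypothesis n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = max(sum(hyp_counts.values()), 1)
        precisions.append(overlap / total)
    if min(precisions) == 0:
        return 0.0  # unsmoothed BLEU collapses if any precision is zero
    log_mean = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty punishes hypotheses shorter than the reference.
    if len(hypothesis) > len(reference):
        bp = 1.0
    else:
        bp = math.exp(1 - len(reference) / max(len(hypothesis), 1))
    return bp * math.exp(log_mean)

ref = "the cat is on the mat".split()
hyp = "the cat sat on the mat".split()
score = sentence_bleu(ref, hyp, max_n=2)  # ≈ 0.707 for this toy pair
```

Note how a single substituted word still yields a high score: BLEU rewards surface n-gram overlap, which is exactly the limitation the survey papers above raise when motivating learned and embedding-based alternatives.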
Automatic machine translation evaluation in many languages via zero-shot paraphrasing
We frame the task of machine translation evaluation as one of scoring machine translation
output with a sequence-to-sequence paraphraser, conditioned on a human reference. We …
Results of the WMT19 metrics shared task: Segment-level and strong MT systems pose big challenges
This paper presents the results of the WMT19 Metrics Shared Task. Participants were asked
to score the outputs of the translation systems competing in the WMT19 News Translation …
Translation quality assessment: A brief survey on manual and automatic methods
To facilitate effective translation modeling and translation studies, one of the crucial
questions to address is how to assess translation quality. From the perspectives of accuracy …
A global analysis of metrics used for measuring performance in natural language processing
Measuring the performance of natural language processing models is challenging.
Traditionally used metrics, such as BLEU and ROUGE, originally devised for machine …
Adequacy–fluency metrics: Evaluating MT in the continuous space model framework
This work extends and evaluates a two-dimensional automatic evaluation metric for machine
translation, which is designed to operate at the sentence level. The metric is based on the …
A critical analysis of metrics used for measuring progress in artificial intelligence
Comparing model performances on benchmark datasets is an integral part of measuring
and driving progress in artificial intelligence. A model's performance on a benchmark …
SBSim: A sentence-BERT similarity-based evaluation metric for Indian language neural machine translation systems
Machine translation (MT) outputs are widely scored using automatic evaluation metrics and
human evaluation scores. The automatic evaluation metrics are expected to be easily …
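The SBSim entry above describes an embedding-similarity metric built on Sentence-BERT. The snippet does not give the scoring formula, but the common core of such metrics is the cosine similarity between the sentence embeddings of the hypothesis and the reference; a self-contained sketch of that core, with small hard-coded vectors standing in for real Sentence-BERT embeddings, might look like this:

```python
import math

def cosine_similarity(u, v):
    # Core of embedding-based MT metrics: the cosine of the angle between
    # the hypothesis embedding and the reference embedding, in [-1, 1].
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 4-dimensional vectors standing in for Sentence-BERT embeddings,
# which in practice are produced by an encoder over the full sentences.
hyp_vec = [0.2, 0.7, 0.1, 0.4]
ref_vec = [0.25, 0.6, 0.05, 0.5]
score = cosine_similarity(hyp_vec, ref_vec)
```

Unlike n-gram metrics such as BLEU, this comparison happens in embedding space, so paraphrases that share little surface vocabulary can still score highly; the actual SBSim metric may add normalization or other components beyond this sketch.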