Efficient methods for natural language processing: A survey
Recent work in natural language processing (NLP) has yielded appealing results from
scaling model parameters and training data; however, using only scale to improve …
Quality-aware decoding for neural machine translation
Despite the progress in machine translation quality estimation and evaluation in the last
years, decoding in neural machine translation (NMT) is mostly oblivious to this and centers …
Understanding and detecting hallucinations in neural machine translation via model introspection
Neural sequence generation models are known to “hallucinate”, by producing outputs that
are unrelated to the source text. These hallucinations are potentially harmful, yet it remains …
UniTE: Unified translation evaluation
Translation quality evaluation plays a crucial role in machine translation. According to the
input format, it is mainly separated into three tasks, i.e., reference-only, source-only and …
MENLI: Robust evaluation metrics from natural language inference
Recently proposed BERT-based evaluation metrics for text generation perform well on
standard benchmarks but are vulnerable to adversarial attacks, e.g., relating to information …
Findings of the WMT 2021 shared task on quality estimation
We report the results of the WMT 2021 shared task on Quality Estimation, where the
challenge is to predict the quality of the output of neural machine translation systems at the …
CodeScore: Evaluating code generation by learning code execution
A proper code evaluation metric (CEM) profoundly impacts the evolution of code generation,
which is an important research field in NLP and software engineering. Prevailing match …
Findings of the WMT 2023 shared task on quality estimation
We report the results of the WMT 2023 shared task on Quality Estimation, in which the
challenge is to predict the quality of the output of neural machine translation systems at the …
The inside story: Towards better understanding of machine translation neural evaluation metrics
Neural metrics for machine translation evaluation, such as COMET, exhibit significant
improvements in their correlation with human judgments, as compared to traditional metrics …
Towards explainable evaluation metrics for machine translation
Unlike classical lexical overlap metrics such as BLEU, most current evaluation metrics for
machine translation (for example, COMET or BERTScore) are based on black-box large …