- Academic Search

Efficient methods for natural language processing: A survey

M Treviso, JU Lee, T Ji, B Aken, Q Cao… - Transactions of the …, 2023 - direct.mit.edu

Recent work in natural language processing (NLP) has yielded appealing results from
scaling model parameters and training data; however, using only scale to improve …

Cited by: 111. All 10 versions.

[PDF] arxiv.org

Quality-aware decoding for neural machine translation

P Fernandes, A Farinhas, R Rei, JGC de Souza… - arXiv preprint arXiv …, 2022 - arxiv.org

Despite the progress in machine translation quality estimation and evaluation in the last
years, decoding in neural machine translation (NMT) is mostly oblivious to this and centers …

Cited by: 79. All 7 versions.

[PDF] mit.edu

Understanding and detecting hallucinations in neural machine translation via model introspection

W Xu, S Agrawal, E Briakou, MJ Martindale… - Transactions of the …, 2023 - direct.mit.edu

Neural sequence generation models are known to “hallucinate”, by producing outputs that
are unrelated to the source text. These hallucinations are potentially harmful, yet it remains …

Cited by: 50. All 8 versions.

[PDF] arxiv.org

UniTE: Unified translation evaluation

Y Wan, D Liu, B Yang, H Zhang, B Chen… - arXiv preprint arXiv …, 2022 - arxiv.org

Translation quality evaluation plays a crucial role in machine translation. According to the
input format, it is mainly separated into three tasks, i.e., reference-only, source-only and …

Cited by: 61. All 6 versions.

[PDF] mit.edu

MENLI: Robust evaluation metrics from natural language inference

Y Chen, S Eger - Transactions of the Association for Computational …, 2023 - direct.mit.edu

Recently proposed BERT-based evaluation metrics for text generation perform well on
standard benchmarks but are vulnerable to adversarial attacks, e.g., relating to information …

Cited by: 38. All 8 versions.

[PDF] aclanthology.org

Findings of the WMT 2021 shared task on quality estimation

L Specia, F Blain, M Fomicheva, C Zerva… - Proceedings of the …, 2021 - aclanthology.org

We report the results of the WMT 2021 shared task on Quality Estimation, where the
challenge is to predict the quality of the output of neural machine translation systems at the …

Cited by: 148. All 12 versions.

[PDF] arxiv.org

CodeScore: Evaluating code generation by learning code execution

Y Dong, J Ding, X Jiang, G Li, Z Li, Z Jin - arXiv preprint arXiv:2301.09043, 2023 - arxiv.org

A proper code evaluation metric (CEM) profoundly impacts the evolution of code generation,
which is an important research field in NLP and software engineering. Prevailing match …

Cited by: 48. All 3 versions.

[PDF] aclanthology.org

Findings of the WMT 2023 shared task on quality estimation

F Blain, C Zerva, R Rei, NM Guerreiro… - Proceedings of the …, 2023 - aclanthology.org

We report the results of the WMT 2023 shared task on Quality Estimation, in which the
challenge is to predict the quality of the output of neural machine translation systems at the …

Cited by: 26. All 5 versions.

[PDF] arxiv.org

The inside story: Towards better understanding of machine translation neural evaluation metrics

R Rei, NM Guerreiro, M Treviso, L Coheur… - arXiv preprint arXiv …, 2023 - arxiv.org

Neural metrics for machine translation evaluation, such as COMET, exhibit significant
improvements in their correlation with human judgments, as compared to traditional metrics …

Cited by: 15. All 5 versions.

[PDF] jmlr.org

Towards explainable evaluation metrics for machine translation

C Leiter, P Lertvittayakumjorn, M Fomicheva… - Journal of Machine …, 2024 - jmlr.org

Unlike classical lexical overlap metrics such as BLEU, most current evaluation metrics for
machine translation (for example, COMET or BERTScore) are based on black-box large …

Cited by: 11. All 5 versions.
