- Academic Search

L Qin, Q Chen, Y Zhou, Z Chen, Y Li, L Liao… - ar…

… accurate and safe
large language models. This is no exception in machine translation (MT), where better …

Cross-lingual Human-Preference Alignment for Neural Machine Translation with Direct Quality Optimization

K Uhlig, J Wuebker, R Reinauer, J DeNero - arXiv preprint arXiv …, 2024 - arxiv.org

Reinforcement Learning from Human Feedback (RLHF) and derivative techniques like
Direct Preference Optimization (DPO) are task-alignment algorithms used to repurpose …

CRPO: Confidence-Reward Driven Preference Optimization for Machine Translation

G Cui, P Wang, Y Liu, Z Ke, Z Liu, V Bhat - arXiv preprint arXiv:2501.13927, 2025 - arxiv.org

Large language models (LLMs) have shown great potential in natural language processing
tasks, but their application to machine translation (MT) remains challenging due to …

Can Automatic Metrics Assess High-Quality Translations?

S Agrawal, A Farinhas, R Rei, AFT Martins - arXiv preprint arXiv …, 2024 - arxiv.org

Automatic metrics for evaluating translation quality are typically validated by measuring how
well they correlate with human assessments. However, correlation methods tend to capture …

Direct preference optimization for neural machine translation with minimum bayes risk decoding

Multilingual large language model: A survey of resources, taxonomy and frontiers