Large Language Models Can Self-Improve in Long-context Reasoning

S Li, C Yang, Z Cheng, L Liu, M Yu, Y Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have achieved substantial progress in processing long
contexts but still struggle with long-context reasoning. Existing approaches typically involve …

Memorization Inheritance in Sequence-Level Knowledge Distillation for Neural Machine Translation

V Dankers, V Raunak - arXiv preprint arXiv:2502.01491, 2025 - arxiv.org
In this work, we explore how instance-level memorization in the teacher Neural Machine
Translation (NMT) model gets inherited by the student model in sequence-level knowledge …

A Bayesian Optimization Approach to Machine Translation Reranking

J Cheng, M Züfle, V Zouhar, A Vlachos - arXiv preprint arXiv:2411.09694, 2024 - arxiv.org
Reranking a list of candidates from a machine translation system with an external scoring
model and returning the highest-scoring candidate remains a simple and effective method …