Chain-of-thought reasoning without prompting

X Wang, D Zhou - Advances in Neural Information Processing Systems, 2025 - proceedings.neurips.cc
In enhancing the reasoning capabilities of large language models (LLMs), prior research
primarily focuses on specific prompting techniques such as few-shot or zero-shot chain-of-thought …
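
The paper's key observation is that CoT paths can be elicited at decoding time alone: branch on the top-k first tokens instead of taking the single greedy one, decode each branch greedily, and prefer the branch whose tokens the model is most confident about. A minimal sketch of that idea, assuming a hypothetical `next_logits` callable in place of a real LM; unlike the paper, which measures the top-1/top-2 probability margin over answer tokens only, this toy averages the margin over the whole continuation.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def greedy_continue(next_logits, prefix, n_steps):
    """Greedily extend `prefix` for n_steps tokens, recording the top-1 /
    top-2 probability gap at each step as a confidence signal."""
    seq, margins = list(prefix), []
    for _ in range(n_steps):
        p = softmax(next_logits(seq))
        top2 = np.sort(p)[-2:]
        margins.append(float(top2[1] - top2[0]))
        seq.append(int(p.argmax()))
    return seq, float(np.mean(margins))

def cot_decode(next_logits, prefix, k=4, n_steps=8):
    """Branch on the top-k first tokens instead of prompting for CoT,
    then return the greedy continuation with the highest mean margin."""
    p0 = softmax(next_logits(list(prefix)))
    best = None
    for t in np.argsort(p0)[-k:]:
        seq, conf = greedy_continue(next_logits, list(prefix) + [int(t)], n_steps - 1)
        if best is None or conf > best[0]:
            best = (conf, seq)
    return best

# Hypothetical stand-in for a real LM's next-token logits.
rng = np.random.default_rng(0)
table = rng.normal(size=(50, 20))
toy_logits = lambda seq: table[sum(seq) % 50]

conf, seq = cot_decode(toy_logits, prefix=[1, 2, 3])
print(f"best path {seq}, confidence {conf:.3f}")
```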

Contrastive decoding: Open-ended text generation as optimization

XL Li, A Holtzman, D Fried, P Liang, J Eisner… - arXiv preprint arXiv …, 2022 - arxiv.org
Given a language model (LM), maximum probability is a poor decoding objective for open-
ended generation, because it produces short and repetitive text. On the other hand …
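
The optimization in the title is concrete: at each step, pick the token that maximizes the difference between an expert LM's and a weaker amateur LM's log-probabilities, restricted to tokens the expert already finds plausible. A minimal one-step sketch over toy logit vectors; in the paper the expert and amateur are a larger and a smaller LM from the same family, not random arrays.

```python
import numpy as np

def contrastive_step(expert_logits, amateur_logits, alpha=0.1):
    """One contrastive-decoding step: among tokens the expert deems
    plausible, pick the one maximizing log p_expert - log p_amateur."""
    pe = np.exp(expert_logits - expert_logits.max())
    pe /= pe.sum()
    pa = np.exp(amateur_logits - amateur_logits.max())
    pa /= pa.sum()
    # Plausibility constraint: keep tokens with p_expert >= alpha * max(p_expert),
    # which stops the amateur from promoting tokens the expert rules out.
    mask = pe >= alpha * pe.max()
    score = np.where(mask, np.log(pe) - np.log(pa), -np.inf)
    return int(score.argmax())

rng = np.random.default_rng(1)
print(contrastive_step(rng.normal(size=32), rng.normal(size=32)))
```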

The unreasonable effectiveness of few-shot learning for machine translation

X Garcia, Y Bansal, C Cherry, G Foster… - International Conference on Machine Learning, 2023 - proceedings.mlr.press
We demonstrate the potential of few-shot translation systems, trained with unpaired
language data, for both high and low-resource language pairs. We show that with only 5 …

Mauve: Measuring the gap between neural text and human text using divergence frontiers

K Pillutla, S Swayamdipta, R Zellers… - Advances in Neural Information Processing Systems, 2021 - proceedings.neurips.cc
As major progress is made in open-ended text generation, measuring how close machine-
generated text is to human language remains a critical open problem. We introduce Mauve …
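
MAUVE's score is the area under a divergence frontier traced by KL divergences between each text distribution and mixtures of the two. A toy analogue over pre-quantized histograms; the real metric first embeds both text collections with an LM and quantizes the embeddings via k-means, and the scaling constant `c` here is illustrative.

```python
import numpy as np

def kl(p, q):
    m = p > 0
    return float(np.sum(p[m] * np.log(p[m] / q[m])))

def mauve_like(p, q, c=5.0, grid=99):
    """Toy divergence frontier between two histograms p and q.
    Returns the area under the scaled-divergence curve in [0, 1]."""
    pts = [(0.0, 1.0), (1.0, 0.0)]           # degenerate frontier endpoints
    for lam in np.linspace(0.01, 0.99, grid):
        r = lam * p + (1 - lam) * q          # mixture distribution
        pts.append((np.exp(-c * kl(q, r)), np.exp(-c * kl(p, r))))
    pts.sort(key=lambda t: (t[0], -t[1]))    # order along the frontier
    x, y = np.array(pts).T
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))  # trapezoid area

human = np.array([0.4, 0.3, 0.2, 0.1])
model = np.array([0.7, 0.1, 0.1, 0.1])
print(f"MAUVE-like score: {mauve_like(human, model):.3f}")
print(f"identical distributions: {mauve_like(human, human):.3f}")  # -> 1.000
```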

Survey of low-resource machine translation

B Haddow, R Bawden, AVM Barone, J Helcl… - Computational Linguistics, 2022 - direct.mit.edu
We present a survey covering the state of the art in low-resource machine translation (MT)
research. There are currently around 7,000 languages spoken in the world and almost all …

Locally typical sampling

C Meister, T Pimentel, G Wiher… - Transactions of the Association for Computational Linguistics, 2023 - direct.mit.edu
Today's probabilistic language generators fall short when it comes to producing coherent
and fluent text, even though the underlying models perform well under standard …
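
The remedy proposed here is locally typical sampling: at each step, keep the tokens whose surprisal is closest to the conditional entropy of the next-token distribution, up to a cumulative mass tau, and sample from that renormalized set. A minimal sketch over toy logits:

```python
import numpy as np

def typical_filter(logits, tau=0.95):
    """Locally typical sampling filter: keep the tokens whose surprisal
    is closest to the distribution's entropy, up to cumulative mass tau."""
    p = np.exp(logits - logits.max())
    p /= p.sum()
    surprisal = -np.log(p)
    entropy = float(np.sum(p * surprisal))           # H of next-token distribution
    order = np.argsort(np.abs(surprisal - entropy))  # most "typical" first
    cut = int(np.searchsorted(np.cumsum(p[order]), tau)) + 1
    keep = order[:cut]
    q = np.zeros_like(p)
    q[keep] = p[keep]
    return q / q.sum()

rng = np.random.default_rng(2)
q = typical_filter(rng.normal(size=16))
print(rng.choice(len(q), p=q), q.round(3))           # sample a token
```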

Accelerating transformer inference for translation via parallel decoding

A Santilli, S Severino, E Postolache, V Maiorca… - arXiv preprint arXiv …, 2023 - arxiv.org
Autoregressive decoding limits the efficiency of transformers for Machine Translation (MT).
The community proposed specific network architectures and learning-based methods to …
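
The paper's alternative keeps the trained model unchanged and reframes greedy decoding as a parallelizable fixed-point (Jacobi) iteration over a block of positions: convergence is guaranteed in at most block-length iterations, and speedups appear when predictions stabilize earlier. A toy sketch, assuming a hypothetical deterministic `next_token` rule in place of the MT model; in a real transformer, one forward pass scores every block position at once, so each iteration costs roughly one pass.

```python
def jacobi_decode(next_token, prefix, block_len=6, max_iters=20):
    """Parallel (Jacobi) decoding sketch: refine a whole block of draft
    tokens at once until it reaches the greedy fixed point, rather than
    generating one token per model call."""
    draft = [0] * block_len                       # arbitrary initialization
    for it in range(max_iters):
        # A real transformer scores all positions in ONE forward pass;
        # here each position's greedy token is recomputed from its context.
        new = [next_token(prefix + draft[:i]) for i in range(block_len)]
        if new == draft:                          # fixed point == greedy output
            return draft, it
        draft = new
    return draft, max_iters

# Hypothetical next-token rule standing in for the MT model; it barely
# depends on the draft tokens, so the iteration converges early (the
# paper's speedups likewise come from predictions stabilizing quickly).
next_token = lambda seq: (3 * len(seq) + seq[0]) % 9
out, iters = jacobi_decode(next_token, prefix=[5, 1])
print(f"decoded {out} in {iters + 1} sweeps (vs {len(out)} sequential steps)")
```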

Natural language to code translation with execution

F Shi, D Fried, M Ghazvininejad, L Zettlemoyer… - arXiv preprint arXiv …, 2022 - arxiv.org
Generative models of code, pretrained on large corpora of programs, have shown great
success in translating natural language to code (Chen et al., 2021; Austin et al., 2021; Li et …
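
The selection recipe the paper builds on (MBR-EXEC) is execution-based: sample several candidate programs, run them on shared inputs, and keep the candidate whose execution results agree most with the others'. A sketch with hypothetical candidates written as plain Python callables rather than generated code:

```python
def mbr_exec(candidates, test_inputs):
    """Execution-based selection in the spirit of MBR-EXEC: run every
    candidate program on shared inputs and keep the one whose execution
    results agree with the most other candidates."""
    def run(fn, xs):
        out = []
        for x in xs:
            try:
                out.append(fn(x))
            except Exception:
                out.append(None)          # crashes recorded as None
        return tuple(out)

    results = [run(fn, test_inputs) for fn in candidates]
    agreement = [sum(r == other for other in results) for r in results]
    return candidates[agreement.index(max(agreement))]

# Three hypothetical model samples for "square a number"; two agree.
cands = [lambda x: x * x, lambda x: x ** 2, lambda x: x + x]
best = mbr_exec(cands, test_inputs=[2, 3, 4])
print(best(5))  # 25 -- the majority behavior wins
```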

Quality-aware decoding for neural machine translation

P Fernandes, A Farinhas, R Rei, JGC de Souza… - arXiv preprint arXiv …, 2022 - arxiv.org
Despite the progress in machine translation quality estimation and evaluation in recent
years, decoding in neural machine translation (NMT) is mostly oblivious to this and centers …
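
A central recipe in the paper is quality-aware reranking, including minimum Bayes risk (MBR) decoding: sample N candidate translations, then score each by its expected utility against the others treated as pseudo-references. A minimal sketch with a toy unigram-F1 utility standing in for the learned metrics (e.g., COMET) used in the paper:

```python
def mbr_select(candidates, utility):
    """Minimum Bayes risk reranking: score each candidate translation by
    its mean utility against the others used as pseudo-references."""
    def expected(i):
        others = [r for j, r in enumerate(candidates) if j != i]
        return sum(utility(candidates[i], r) for r in others) / len(others)
    return max(range(len(candidates)), key=expected)

def unigram_f1(hyp, ref):
    """Toy utility: unigram F1, a stand-in for a learned quality metric."""
    h, r = hyp.split(), ref.split()
    common = sum(min(h.count(w), r.count(w)) for w in set(h))
    if common == 0:
        return 0.0
    prec, rec = common / len(h), common / len(r)
    return 2 * prec * rec / (prec + rec)

samples = [
    "the cat sat on the mat",
    "a cat sat on the mat",
    "the cat is sitting on a mat",
]
print(samples[mbr_select(samples, unigram_f1)])
```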

Uncertainty estimation in autoregressive structured prediction

A Malinin, M Gales - arXiv preprint arXiv:2002.07650, 2020 - arxiv.org
Uncertainty estimation is important for ensuring the safety and robustness of AI systems. While
most research in the area has focused on unstructured prediction tasks, limited work has …
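
The measures in this line of work are ensemble-based: total uncertainty is the entropy of the ensemble-averaged prediction, data (aleatoric) uncertainty is the mean member entropy, and their difference (mutual information) is knowledge (epistemic) uncertainty. A single-token sketch of that decomposition; the paper extends it to whole sequences, which this toy omits.

```python
import numpy as np

def token_uncertainty(ensemble_probs):
    """Ensemble-based uncertainty for one decoding step: total = entropy
    of the averaged predictive, data = mean member entropy, knowledge
    (epistemic) = their difference, i.e. the mutual information."""
    def entropy(p):
        return float(-np.sum(p * np.log(p + 1e-12)))

    mean_p = ensemble_probs.mean(axis=0)
    total = entropy(mean_p)
    data = float(np.mean([entropy(p) for p in ensemble_probs]))
    return total, data, total - data

# Three hypothetical ensemble members' next-token distributions.
rng = np.random.default_rng(3)
probs = rng.dirichlet(np.ones(10), size=3)
total, data, knowledge = token_uncertainty(probs)
print(f"total={total:.3f} data={data:.3f} knowledge={knowledge:.3f}")
```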