Learning to summarize with human feedback

N Stiennon, L Ouyang, J Wu… - Advances in …, 2020 - proceedings.neurips.cc
As language models become more powerful, training and evaluation are increasingly
bottlenecked by the data and metrics used for a particular task. For example, summarization …

Symbolic chain-of-thought distillation: Small models can also "think" step-by-step

LH Li, J Hessel, Y Yu, X Ren, KW Chang… - arXiv preprint arXiv …, 2023 - arxiv.org
Chain-of-thought prompting (e.g., "Let's think step-by-step") primes large language models to
verbalize rationalizations for their predictions. While chain-of-thought can lead to dramatic …

Improving code generation by training with natural language feedback

A Chen, J Scheurer, T Korbak, JA Campos… - arXiv preprint arXiv …, 2023 - arxiv.org
The potential for pre-trained large language models (LLMs) to use natural language
feedback at inference time has been an exciting recent development. We build upon this …

LIREx: Augmenting language inference with relevant explanations

X Zhao, VGV Vydiswaran - Proceedings of the AAAI Conference on …, 2021 - ojs.aaai.org
Natural language explanations (NLEs) are a special form of data annotation in which
annotators identify rationales (most significant text tokens) when assigning labels to data …

Towards interpretable natural language understanding with explanations as latent variables

W Zhou, J Hu, H Zhang, X Liang… - Advances in …, 2020 - proceedings.neurips.cc
Recently, generating natural language explanations has shown very promising results in not
only offering interpretable explanations but also providing additional information and …

Relationship-embedded representation learning for grounding referring expressions

S Yang, G Li, Y Yu - IEEE Transactions on Pattern Analysis and …, 2020 - ieeexplore.ieee.org
Grounding referring expressions in images aims to locate the object instance in an image
described by a referring expression. It involves a joint understanding of natural language …

Training language models with language feedback

J Scheurer, JA Campos, JS Chan, A Chen… - arXiv preprint arXiv …, 2022 - arxiv.org
Pretrained language models often do not perform tasks in ways that are in line with our
preferences, e.g., generating offensive text or factually incorrect summaries. Recent work …

What if you said that differently? How explanation formats affect human feedback efficacy and user perception

C Malaviya, S Lee, D Roth… - Proceedings of the 2024 …, 2024 - aclanthology.org
Eliciting feedback from end users of NLP models can be beneficial for improving models.
However, how should we present model responses to users so they are most amenable to …

Unpaired image captioning by image-level weakly-supervised visual concept recognition

P Zhu, X Wang, Y Luo, Z Sun, WS Zheng… - IEEE Transactions …, 2022 - ieeexplore.ieee.org
The goal of unpaired image captioning (UIC) is to describe images without using image-
caption pairs in the training phase. Although challenging, we expect the task can be …