ImageReward: Learning and evaluating human preferences for text-to-image generation
We present a comprehensive solution to learn and improve text-to-image models from
human preference feedback. To begin with, we build ImageReward---the first general …
Training language models to follow instructions with human feedback
Making language models bigger does not inherently make them better at following a user's
intent. For example, large language models can generate outputs that are untruthful, toxic, or …
Aligning text-to-image models using human feedback
Deep generative models have shown impressive results in text-to-image synthesis.
However, current text-to-image models often generate images that are inadequately aligned …
Bridging the gap: A survey on integrating (human) feedback for natural language generation
Natural language generation has witnessed significant advancements due to the training of
large language models on vast internet-scale datasets. Despite these advancements, there …
Learning to summarize with human feedback
As language models become more powerful, training and evaluation are increasingly
bottlenecked by the data and metrics used for a particular task. For example, summarization …
Recursively summarizing books with human feedback
A major challenge for scaling machine learning is training models to perform tasks that are
very difficult or time-consuming for humans to evaluate. We present progress on this …
DRESS: Instructing large vision-language models to align and interact with humans via natural language feedback
We present DRESS, a large vision language model (LVLM) that innovatively exploits Natural
Language feedback (NLF) from Large Language Models to enhance its alignment and …