Ultrafeedback: Boosting language models with high-quality feedback

G Cui, L Yuan, N Ding, G Yao, W Zhu, Y Ni, G Xie, Z Liu… - 2023 - openreview.net
Reinforcement learning from human feedback (RLHF) has become a pivotal technique in
aligning large language models (LLMs) with human preferences. In RLHF practice …

Aligning large language models with human: A survey

Y Wang, W Zhong, L Li, F Mi, X Zeng, W Huang… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) trained on extensive textual corpora have emerged as
leading solutions for a broad array of Natural Language Processing (NLP) tasks. Despite …

Prometheus: Inducing fine-grained evaluation capability in language models

S Kim, J Shin, Y Cho, J Jang, S Longpre… - The Twelfth …, 2023 - openreview.net
Recently, GPT-4 has become the de facto evaluator for long-form text generated by large
language models (LLMs). However, for practitioners and researchers with large and custom …

Generative judge for evaluating alignment

J Li, S Sun, W Yuan, RZ Fan, H Zhao, P Liu - arXiv preprint arXiv …, 2023 - arxiv.org
The rapid development of Large Language Models (LLMs) has substantially expanded the
range of tasks they can address. In the field of Natural Language Processing (NLP) …

Training language models to self-correct via reinforcement learning

A Kumar, V Zhuang, R Agarwal, Y Su… - arXiv preprint arXiv …, 2024 - arxiv.org
Self-correction is a highly desirable capability of large language models (LLMs), yet it has
consistently been found to be largely ineffective in modern LLMs. Current methods for …

Flask: Fine-grained language model evaluation based on alignment skill sets

S Ye, D Kim, S Kim, H Hwang, S Kim, Y Jo… - arXiv preprint arXiv …, 2023 - arxiv.org
Evaluation of Large Language Models (LLMs) is challenging because aligning to human
values requires the composition of multiple skills and the required set of skills varies …

Shepherd: A critic for language model generation

T Wang, P Yu, XE Tan, S O'Brien, R Pasunuru… - arXiv preprint arXiv …, 2023 - arxiv.org
As large language models improve, there is increasing interest in techniques that leverage
these models' capabilities to refine their own outputs. In this work, we introduce Shepherd, a …

Internal consistency and self-feedback in large language models: A survey

X Liang, S Song, Z Zheng, H Wang, Q Yu, X Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) often exhibit deficient reasoning or generate hallucinations.
To address these, studies prefixed with "Self-" such as Self-Consistency, Self-Improve, and …

A survey on knowledge distillation of large language models

X Xu, M Li, C Tao, T Shen, R Cheng, J Li, C Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
This survey presents an in-depth exploration of knowledge distillation (KD) techniques
within the realm of Large Language Models (LLMs), spotlighting the pivotal role of KD in …

Confidence matters: Revisiting intrinsic self-correction capabilities of large language models

L Li, Z Chen, G Chen, Y Zhang, Y Su, E Xing… - arXiv preprint arXiv …, 2024 - arxiv.org
The recent success of Large Language Models (LLMs) has catalyzed an increasing interest
in their self-correction capabilities. This paper presents a comprehensive investigation into …