Knowledge editing for large language models: A survey

S Wang, Y Zhu, H Liu, Z Zheng, C Chen, J Li - ACM Computing Surveys, 2024 - dl.acm.org
Large Language Models (LLMs) have recently transformed both the academic and industrial
landscapes due to their remarkable capacity to understand, analyze, and generate texts …

Decoding ChatGPT: a taxonomy of existing research, current challenges, and possible future directions

SS Sohail, F Farhat, Y Himeur, M Nadeem… - Journal of King Saud …, 2023 - Elsevier
Chat Generative Pre-trained Transformer (ChatGPT) has gained significant interest
and attention since its launch in November 2022. It has shown impressive performance in …

Alpacafarm: A simulation framework for methods that learn from human feedback

Y Dubois, CX Li, R Taori, T Zhang… - Advances in …, 2024 - proceedings.neurips.cc
Large language models (LLMs) such as ChatGPT have seen widespread adoption due to
their ability to follow user instructions well. Developing these LLMs involves a complex yet …

Using large language models to simulate multiple humans and replicate human subject studies

GV Aher, RI Arriaga, AT Kalai - International Conference on …, 2023 - proceedings.mlr.press
We introduce a new type of test, called a Turing Experiment (TE), for evaluating to what
extent a given language model, such as GPT models, can simulate different aspects of …

Open problems and fundamental limitations of reinforcement learning from human feedback

S Casper, X Davies, C Shi, TK Gilbert… - arXiv preprint arXiv …, 2023 - arxiv.org
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems
to align with human goals. RLHF has emerged as the central method used to finetune state …

Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

Evaluating verifiability in generative search engines

NF Liu, T Zhang, P Liang - arXiv preprint arXiv:2304.09848, 2023 - arxiv.org
Generative search engines directly generate responses to user queries, along with in-line
citations. A prerequisite trait of a trustworthy generative search engine is verifiability, i.e., …

Towards understanding sycophancy in language models

M Sharma, M Tong, T Korbak, D Duvenaud… - arXiv preprint arXiv …, 2023 - arxiv.org
Human feedback is commonly used to finetune AI assistants. But human feedback may
also encourage model responses that match user beliefs over truthful ones, a behavior …

Evaluating the moral beliefs encoded in LLMs

N Scherrer, C Shi, A Feder… - Advances in Neural …, 2024 - proceedings.neurips.cc
This paper presents a case study on the design, administration, post-processing, and
evaluation of surveys on large language models (LLMs). It comprises two components: (1) A …

Diffusion model alignment using direct preference optimization

B Wallace, M Dang, R Rafailov… - Proceedings of the …, 2024 - openaccess.thecvf.com
Large language models (LLMs) are fine-tuned using human comparison data with
Reinforcement Learning from Human Feedback (RLHF) methods to make them better …