The benefits, risks and bounds of personalizing the alignment of large language models to individuals

HR Kirk, B Vidgen, P Röttger, SA Hale - Nature Machine Intelligence, 2024 - nature.com
Large language models (LLMs) undergo 'alignment' so that they better reflect human values
or preferences, and are safer or more useful. However, alignment is intrinsically difficult …

From matching to generation: A survey on generative information retrieval

X Li, J Jin, Y Zhou, Y Zhang, P Zhang, Y Zhu… - arXiv preprint arXiv …, 2024 - arxiv.org
Information Retrieval (IR) systems are crucial tools for users to access information, widely
applied in scenarios like search engines, question answering, and recommendation …

Prometheus 2: An open source language model specialized in evaluating other language models

S Kim, J Suk, S Longpre, BY Lin, J Shin… - arXiv preprint arXiv …, 2024 - arxiv.org
Proprietary LMs such as GPT-4 are often employed to assess the quality of responses from
various LMs. However, concerns including transparency, controllability, and affordability …

The PRISM alignment project: What participatory, representative and individualised human feedback reveals about the subjective and multicultural alignment of large …

HR Kirk, A Whitefield, P Röttger, A Bean… - arXiv preprint arXiv …, 2024 - arxiv.org
Human feedback plays a central role in the alignment of Large Language Models (LLMs).
However, open questions remain about the methods (how), domains (where), people (who) …

Arithmetic control of LLMs for diverse user preferences: Directional preference alignment with multi-objective rewards

H Wang, Y Lin, W Xiong, R Yang, S Diao, S Qiu… - arXiv preprint arXiv …, 2024 - arxiv.org
Fine-grained control over large language models (LLMs) remains a significant challenge,
hindering their adaptability to diverse user needs. While Reinforcement Learning from …

MaxMin-RLHF: Alignment with diverse human preferences

S Chakraborty, J Qiu, H Yuan, A Koppel… - arXiv preprint arXiv …, 2024 - arxiv.org
Reinforcement Learning from Human Feedback (RLHF) aligns language models to human
preferences by employing a singular reward model derived from preference data. However …

From persona to personalization: A survey on role-playing language agents

J Chen, X Wang, R Xu, S Yuan, Y Zhang, W Shi… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in large language models (LLMs) have significantly boosted the rise
of Role-Playing Language Agents (RPLAs), i.e., specialized AI systems designed to simulate …

Aligning to thousands of preferences via system message generalization

S Lee, SH Park, S Kim, M Seo - Advances in Neural …, 2025 - proceedings.neurips.cc
Although humans inherently have diverse values, current large language model (LLM)
alignment methods often assume that aligning LLMs with the general public's preferences is …

Optimization methods for personalizing large language models through retrieval augmentation

A Salemi, S Kallumadi, H Zamani - … of the 47th International ACM SIGIR …, 2024 - dl.acm.org
This paper studies retrieval-augmented approaches for personalizing large language
models (LLMs), which potentially have a substantial impact on various applications and …

Model merging in LLMs, MLLMs, and beyond: Methods, theories, applications and opportunities

E Yang, L Shen, G Guo, X Wang, X Cao… - arXiv preprint arXiv …, 2024 - arxiv.org
Model merging is an efficient technique in the machine learning community
that does not require the collection of raw training data and does not require expensive …