Whose opinions do language models reflect?
Language models (LMs) are increasingly being used in open-ended contexts,
where the opinions they reflect in response to subjective queries can have a profound …
Holistic evaluation of language models
Language models (LMs) are becoming the foundation for almost all major language
technologies, but their capabilities, limitations, and risks are not well understood. We present …
Understanding Practices, Challenges, and Opportunities for User-Engaged Algorithm Auditing in Industry Practice
Recent years have seen growing interest among both researchers and practitioners in user-
engaged approaches to algorithm auditing, which directly engage users in detecting …
Open problems and fundamental limitations of reinforcement learning from human feedback
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems
to align with human goals. RLHF has emerged as the central method used to finetune state …
Towards measuring the representation of subjective global opinions in language models
Large language models (LLMs) may not equitably represent diverse global perspectives on
societal issues. In this paper, we develop a quantitative framework to evaluate whose …
Bridging the gap: A survey on integrating (human) feedback for natural language generation
Natural language generation has witnessed significant advancements due to the training of
large language models on vast internet-scale datasets. Despite these advancements, there …
The 'Problem' of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation
Human variation in labeling is often considered noise. Annotation projects for machine
learning (ML) aim at minimizing human label variation, with the assumption to maximize …
LaMP: When large language models meet personalization
This paper highlights the importance of personalization in large language models and
introduces the LaMP benchmark--a novel benchmark for training and evaluating language …
Toward a perspectivist turn in ground truthing for predictive computing
Most current Artificial Intelligence applications are based on supervised Machine
Learning (ML), which ultimately grounds on data annotated by small teams of experts or …
Working with AI to persuade: Examining a large language model's ability to generate pro-vaccination messages
Artificial Intelligence (AI) is a transformative force in communication and messaging strategy,
with potential to disrupt traditional approaches. Large language models (LLMs), a form of AI …