Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arXiv preprint arXiv …, 2024 - arxiv.org
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

Implicit bias in large language models: Experimental proof and implications for education

M Warr, NJ Oster, R Isaac - Journal of Research on Technology in …, 2024 - Taylor & Francis
We provide experimental evidence of implicit racial bias in a large language model
(specifically ChatGPT 3.5) in the context of an educational task and discuss implications for …

Is “A Helpful Assistant” the Best Role for Large Language Models? A Systematic Evaluation of Social Roles in System Prompts

M Zheng, J Pei, D Jurgens - arXiv preprint arXiv:2311.10054, 2023 - academia.edu
Prompting serves as the major way humans interact with Large Language Models (LLMs).
Commercial AI systems commonly define the role of the LLM in system prompts. For …

Impact of an adaptive dialog that uses natural language processing to detect students' ideas and guide knowledge integration

L Gerard, M Holtman, B Riordan… - Journal of Educational …, 2024 - psycnet.apa.org
This study leverages natural language processing (NLP) to deepen our understanding of
how students integrate their ideas about genetic inheritance while engaging in an adaptive …

Human and LLM Biases in Hate Speech Annotations: A Socio-Demographic Analysis of Annotators and Targets

T Giorgi, L Cima, T Fagni, M Avvenuti… - arXiv preprint arXiv …, 2024 - arxiv.org
The rise of online platforms exacerbated the spread of hate speech, demanding scalable
and effective detection. However, the accuracy of hate speech detection systems heavily …

Large language models cannot replace human participants because they cannot portray identity groups

A Wang, J Morgenstern, JP Dickerson - arXiv preprint arXiv:2402.01908, 2024 - arxiv.org
Large language models (LLMs) are increasing in capability and popularity, propelling their
application in new domains--including as replacements for human participants in …

LLMs generate structurally realistic social networks but overestimate political homophily

S Chang, A Chaszczewicz, E Wang… - arXiv preprint arXiv …, 2024 - arxiv.org
Generating social networks is essential for many applications, such as epidemic modeling
and social simulations. Prior approaches either involve deep learning models, which require …

From persona to personalization: A survey on role-playing language agents

J Chen, X Wang, R Xu, S Yuan, Y Zhang, W Shi… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent advancements in large language models (LLMs) have significantly boosted the rise
of Role-Playing Language Agents (RPLAs), i.e., specialized AI systems designed to simulate …

Bias and toxicity in role-play reasoning

J Zhao, Z Qian, L Cao, Y Wang, Y Ding - arXiv preprint arXiv:2409.13979, 2024 - arxiv.org
Role-play in the Large Language Model (LLM) is a crucial technique that enables models to
adopt specific perspectives, enhancing their ability to generate contextually relevant and …

PersonaGym: Evaluating persona agents and LLMs

V Samuel, HP Zou, Y Zhou, S Chaudhari… - arXiv preprint arXiv …, 2024 - arxiv.org
Persona agents, which are LLM agents that act according to an assigned persona, have
demonstrated impressive contextual response capabilities across various applications …