AI generates covertly racist decisions about people based on their dialect

V Hofmann, PR Kalluri, D Jurafsky, S King - Nature, 2024 - nature.com
Hundreds of millions of people now interact with language models, with uses ranging from
help with writing, to informing hiring decisions. However, these language models are known …

Machine psychology: Investigating emergent capabilities and behavior in large language models using psychological methods

T Hagendorff - arXiv preprint arXiv:2303.13988, 2023 - cybershafarat.com
Large language models (LLMs) are currently at the forefront of intertwining AI systems with
human communication and everyday life. Due to rapid technological advances and their …

XSTest: A test suite for identifying exaggerated safety behaviours in large language models

P Röttger, HR Kirk, B Vidgen, G Attanasio… - arXiv preprint arXiv …, 2023 - arxiv.org
Without proper safeguards, large language models will readily follow malicious instructions
and generate toxic content. This risk motivates safety efforts such as red-teaming and large …

Quantifying AI psychology: A psychometrics benchmark for large language models

Y Li, Y Huang, H Wang, X Zhang, J Zou… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have demonstrated exceptional task-solving capabilities,
increasingly adopting roles akin to human-like assistants. The broader integration of LLMs …

Are Large Language Models Consistent over Value-laden Questions?

J Moore, T Deshpande, D Yang - arXiv preprint arXiv:2407.02996, 2024 - arxiv.org
Large language models (LLMs) appear to bias their survey answers toward certain values.
Nonetheless, some argue that LLMs are too inconsistent to simulate particular values. Are …

A Survey on Responsible LLMs: Inherent Risk, Malicious Use, and Mitigation Strategy

H Wang, W Fu, Y Tang, Z Chen, Y Huang… - arXiv preprint arXiv …, 2025 - arxiv.org
While large language models (LLMs) present significant potential for supporting numerous
real-world applications and delivering positive social impacts, they still face significant …

Hidden Persuaders: LLMs' Political Leaning and Their Influence on Voters

Y Potter, S Lai, J Kim, J Evans, D Song - arXiv preprint arXiv:2410.24190, 2024 - arxiv.org
How could LLMs influence our democracy? We investigate LLMs' political leanings and the
potential influence of LLMs on voters by conducting multiple experiments in a US …

Biased AI can influence political decision-making

J Fisher, S Feng, R Aron, T Richardson, Y Choi… - arXiv preprint arXiv …, 2024 - arxiv.org
As modern AI models become integral to everyday tasks, concerns about their inherent
biases and their potential impact on human decision-making have emerged. While bias in …

Revealing fine-grained values and opinions in large language models

D Wright, A Arora, N Borenstein, S Yadav… - arXiv preprint arXiv …, 2024 - arxiv.org
Uncovering latent values and opinions embedded in large language models (LLMs) can
help identify biases and mitigate potential harm. Recently, this has been approached by …

Benchmark suites instead of leaderboards for evaluating AI fairness

A Wang, A Hertzmann, O Russakovsky - Patterns, 2024 - cell.com
Benchmarks and leaderboards are commonly used to track the fairness impacts of artificial
intelligence (AI) models. Many critics argue against this practice, since it incentivizes …