WaitGPT: Monitoring and steering conversational LLM agent in data analysis with on-the-fly code visualization

L Xie, C Zheng, H Xia, H Qu, C Zhu-Tian - Proceedings of the 37th …, 2024 - dl.acm.org
Large language models (LLMs) support data analysis through conversational user
interfaces, as exemplified in OpenAI's ChatGPT (formerly known as Advanced Data Analysis …

AI safety in generative AI large language models: A survey

J Chua, Y Li, S Yang, C Wang, L Yao - arXiv preprint arXiv:2407.18369, 2024 - arxiv.org
Large Language Models (LLMs) such as ChatGPT that exhibit generative AI capabilities are
facing accelerated adoption and innovation. The increased presence of Generative AI (GAI) …

Ethical AI Governance: Methods for Evaluating Trustworthy AI

L McCormack, M Bendechache - arXiv preprint arXiv:2409.07473, 2024 - arxiv.org
Trustworthy Artificial Intelligence (TAI) integrates ethics that align with human values, looking
at their influence on AI behaviour and decision-making. Primarily dependent on self …

ValueCompass: A framework of fundamental values for human-AI alignment

H Shen, T Knearem, R Ghosh, YJ Yang, T Mitra… - arXiv preprint arXiv …, 2024 - arxiv.org
As AI systems become more advanced, ensuring their alignment with a diverse range of
individuals and societal values becomes increasingly critical. But how can we capture …

Empirical Impacts of Independent and Collaborative Training on Task Performance and Improvement in Human-AI Teams

C Flathmann, BG Schelble… - Proceedings of the …, 2024 - journals.sagepub.com
With improving AI technology, human-AI teams are becoming increasingly common in
research. Within these teams, humans and AI can work collaboratively to complete shared …

SafetyAnalyst: Interpretable, transparent, and steerable LLM safety moderation

JJ Li, V Pyatkin, M Kleiman-Weiner, L Jiang… - arXiv preprint arXiv …, 2024 - arxiv.org
The ideal LLM content moderation system would be both structurally interpretable (so its
decisions can be explained to users) and steerable (to reflect a community's values or align …

Framework for human–XAI symbiosis: extended self from the dual-process theory perspective

Y Litvinova, P Mikalef, X Luo - Journal of Business Analytics, 2024 - Taylor & Francis
The use of artificial intelligence (AI)-based decision support systems (DSSs) is expected to
enable superior human–XAI performance. To enhance decision-making performance …

Why human-AI relationships need socioaffective alignment

HR Kirk, I Gabriel, C Summerfield, B Vidgen… - arXiv preprint arXiv …, 2025 - arxiv.org
Humans strive to design safe AI systems that align with our goals and remain under our
control. However, as AI capabilities advance, we face a new challenge: the emergence of …

C3AI: Crafting and Evaluating Constitutions for Constitutional AI

Y Kyrychenko, K Zhou, E Bogucka… - arXiv preprint arXiv …, 2025 - arxiv.org
Constitutional AI (CAI) guides LLM behavior using constitutions, but identifying which
principles are most effective for model alignment remains an open challenge. We introduce …

To Rely or Not to Rely? Evaluating Interventions for Appropriate Reliance on Large Language Models

JY Bo, S Wan, A Anderson - arXiv preprint arXiv:2412.15584, 2024 - arxiv.org
As Large Language Models become integral to decision-making, optimism about their
power is tempered with concern over their errors. Users may over-rely on LLM advice that is …