Proactive conversational agents in the post-ChatGPT world

L Liao, GH Yang, C Shah - Proceedings of the 46th International ACM …, 2023 - dl.acm.org
ChatGPT and similar large language model (LLM) based conversational agents have
brought shock waves to the research world. Although astonished by their human-like …

SafetyPrompts: A systematic review of open datasets for evaluating and improving large language model safety

P Röttger, F Pernisi, B Vidgen, D Hovy - arXiv preprint arXiv:2404.05399, 2024 - arxiv.org
The last two years have seen a rapid growth in concerns around the safety of large
language models (LLMs). Researchers and practitioners have met these concerns by …

XSTest: A test suite for identifying exaggerated safety behaviours in large language models

P Röttger, HR Kirk, B Vidgen, G Attanasio… - arXiv preprint arXiv …, 2023 - arxiv.org
Without proper safeguards, large language models will readily follow malicious instructions
and generate toxic content. This risk motivates safety efforts such as red-teaming and large …

" I'm Not Sure, But...": Examining the Impact of Large Language Models' Uncertainty Expression on User Reliance and Trust

SSY Kim, QV Liao, M Vorvoreanu, S Ballard… - Proceedings of the …, 2024 - dl.acm.org
Widely deployed large language models (LLMs) can produce convincing yet incorrect
outputs, potentially misleading users who may rely on them as if they were correct. To …

ProsocialDialog: A prosocial backbone for conversational agents

H Kim, Y Yu, L Jiang, X Lu, D Khashabi, G Kim… - arXiv preprint arXiv …, 2022 - arxiv.org
Most existing dialogue systems fail to respond properly to potentially unsafe user utterances
by either ignoring or passively agreeing with them. To address this issue, we introduce …

ROBBIE: Robust bias evaluation of large generative language models

D Esiobu, X Tan, S Hosseini, M Ung, Y Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org
As generative large language models (LLMs) grow more performant and prevalent, we must
develop comprehensive enough tools to measure and improve their fairness. Different …

Mirages: On anthropomorphism in dialogue systems

G Abercrombie, AC Curry, T Dinkar, V Rieser… - arXiv preprint arXiv …, 2023 - arxiv.org
Automated dialogue or conversational systems are anthropomorphised by developers and
personified by users. While a degree of anthropomorphism may be inevitable due to the …

The ethics of advanced AI assistants

I Gabriel, A Manzini, G Keeling, LA Hendricks… - arXiv preprint arXiv …, 2024 - arxiv.org
This paper focuses on the opportunities and the ethical and societal risks posed by
advanced AI assistants. We define advanced AI assistants as artificial agents with natural …

DICES dataset: Diversity in conversational AI evaluation for safety

L Aroyo, A Taylor, M Diaz, C Homan… - Advances in …, 2023 - proceedings.neurips.cc
Machine learning approaches often require training and evaluation datasets with a
clear separation between positive and negative examples. This requirement overly …

Gaining wisdom from setbacks: Aligning large language models via mistake analysis

K Chen, C Wang, K Yang, J Han, L Hong, F Mi… - arXiv preprint arXiv …, 2023 - arxiv.org
The rapid development of large language models (LLMs) has not only provided numerous
opportunities but also presented significant challenges. This becomes particularly evident …