Beyond preferences in AI alignment
The dominant practice of AI alignment assumes (1) that preferences are an adequate
representation of human values, (2) that human rationality can be understood in terms of …
The Moral Case for Using Language Model Agents for Recommendation
Our information and communication environment has fallen short of the ideals that
networked global communication might have served. Identifying all the causes of its …
Evaluating the cybersecurity robustness of commercial LLMs against adversarial prompts: A PromptBench analysis
T Goto, K Ono, A Morita - Authorea Preprints, 2024 - techrxiv.org
This study presents a comprehensive evaluation of the cybersecurity robustness of five
leading Large Language Models (LLMs)-ChatGPT-4, Google Gemini, Anthropic Claude …
GPT-ology, Computational Models, Silicon Sampling: How should we think about LLMs in Cognitive Science?
DC Ong - arXiv preprint arXiv:2406.09464, 2024 - arxiv.org
Large Language Models have taken the cognitive science world by storm. It is perhaps
timely now to take stock of the various research paradigms that have been used to make …
[PDF][PDF] Value as semantics: Representations of human moral and hedonic value in large language models
A Leshinskaya, C San Franscisco… - … 2023 Workshop: AI …, 2023 - ai.objectives.institute
Aligning AI with human objectives can be facilitated by enabling it to learn and veridically
represent our values. In modern AI agents, value is a scalar magnitude reflecting the …
A Comparative Analysis of Human and Machine Translation Quality
C Marshall - 2024 - scholarsarchive.byu.edu
A common question raised by both translators and Machine Translation developers is Will
Machine Translation (MT) ever attain the level of Human Translation (HT) quality …
Contractual AI: Toward More Aligned, Transparent, and Robust Dialogue Agents
We present a new framework for AI alignment called Contractual AI, and apply it to the
setting of dialogue agents chatting with humans. This framework incorporates and builds on …