Beyond preferences in AI alignment

T Zhi-Xuan, M Carroll, M Franklin, H Ashton - Philosophical Studies, 2024 - Springer
The dominant practice of AI alignment assumes (1) that preferences are an adequate
representation of human values, (2) that human rationality can be understood in terms of …

Optimal policies tend to seek power

AM Turner, L Smith, R Shah, A Critch… - arXiv preprint arXiv …, 2019 - arxiv.org
Some researchers speculate that intelligent reinforcement learning (RL) agents would be
incentivized to seek resources and power in pursuit of their objectives. Other researchers …

To the noise and back: Diffusion for shared autonomy

T Yoneda, L Sun, B Stadie, M Walter - arXiv preprint arXiv:2302.12244, 2023 - arxiv.org
Shared autonomy is an operational concept in which a user and an autonomous agent
collaboratively control a robotic system. It provides a number of advantages over the …

Warmth and competence in human-agent cooperation

KR McKee, X Bai, ST Fiske - Autonomous Agents and Multi-Agent Systems, 2024 - Springer
Interaction and cooperation with humans are overarching aspirations of artificial intelligence
research. Recent studies demonstrate that AI agents trained with deep reinforcement …

First contact: Unsupervised human-machine co-adaptation via mutual information maximization

S Reddy, S Levine, A Dragan - Advances in Neural …, 2022 - proceedings.neurips.cc
How can we train an assistive human-machine interface (e.g., an electromyography-based
limb prosthesis) to translate a user's raw command signals into the actions of a robot or …

Learning to assist humans without inferring rewards

V Myers, E Ellis, S Levine, B Eysenbach… - arXiv preprint arXiv …, 2024 - arxiv.org
Assistive agents should make humans' lives easier. Classically, such assistance is studied
through the lens of inverse reinforcement learning, where an assistive agent (e.g., a chatbot …

ASHA: Assistive teleoperation via human-in-the-loop reinforcement learning

S Chen, J Gao, S Reddy, G Berseth… - … on Robotics and …, 2022 - ieeexplore.ieee.org
Building assistive interfaces for controlling robots through arbitrary, high-dimensional, noisy
inputs (e.g., webcam images of eye gaze) can be challenging, especially when it involves …

Be considerate: Avoiding negative side effects in reinforcement learning

P Alizadeh Alamdari, TQ Klassen… - Proceedings of the …, 2022 - cs.toronto.edu
Sequential decision making, whether it is realized via reinforcement learning (RL),
supervised learning, or some form of probabilistic or otherwise symbolic planning using …

Human participants in AI research: Ethics and transparency in practice

KR McKee - IEEE Transactions on Technology and Society, 2024 - ieeexplore.ieee.org
In recent years, research involving human participants has been critical to advances in
artificial intelligence (AI) and machine learning (ML), particularly in the areas of …

SARI: Shared autonomy across repeated interaction

A Jonnavittula, SA Mehta, DP Losey - ACM Transactions on Human …, 2024 - dl.acm.org
Assistive robot arms try to help their users perform everyday tasks. One way robots can
provide this assistance is shared autonomy. Within shared autonomy, both the human and …