Google Scholar

Scalable agent alignment via reward modeling: a research direction

J Leike, D Krueger, T Everitt, M Martic, V Maini… - arxiv preprint arxiv …, 2018 - arxiv.org

One obstacle to applying reinforcement learning algorithms to real-world problems is the
lack of suitable reward functions. Designing such reward functions is difficult in part because …

Cited by 363 · 6 versions

[PDF] arxiv.org

Self-control in cyberspace: Applying dual systems theory to a review of digital self-control tools

U Lyngs, K Lukoff, P Slovak, R Binns, A Slack… - proceedings of the …, 2019 - dl.acm.org

Many people struggle to control their use of digital devices. However, our understanding of
the design mechanisms that support user self-control remains limited. In this paper, we make …

Cited by 196 · 13 versions

[PDF] mlr.press

Off-policy deep reinforcement learning without exploration

S Fujimoto, D Meger, D Precup - … conference on machine …, 2019 - proceedings.mlr.press

Many practical applications of reinforcement learning constrain agents to learn from a fixed
batch of data which has already been gathered, without offering further possibility for data …

Cited by 1791 · 9 versions

[PDF] mlr.press

Machine theory of mind

N Rabinowitz, F Perbet, F Song… - International …, 2018 - proceedings.mlr.press

Theory of mind (ToM) broadly refers to humans' ability to represent the mental
states of others, including their desires, beliefs, and intentions. We design a Theory of Mind …

Cited by 682 · 8 versions

[PDF] datascienceassn.org

Concrete problems in AI safety

D Amodei, C Olah, J Steinhardt, P Christiano… - arxiv preprint arxiv …, 2016 - arxiv.org

Rapid progress in machine learning and artificial intelligence (AI) has brought increasing
attention to the potential impacts of AI technologies on society. In this paper we discuss one …

Cited by 3130 · 9 versions

[PDF] neurips.cc

Inverse reward design

D Hadfield-Menell, S Milli, P Abbeel… - Advances in neural …, 2017 - proceedings.neurips.cc

Autonomous agents optimize the reward function we give them. What they don't know is how
hard it is for us to design a reward function that actually captures what we want. When …

Cited by 509 · 19 versions

[PDF] royalsocietypublishing.org

Emotion prediction as computation over a generative theory of mind

SD Houlihan, M Kleiman-Weiner… - … of the Royal …, 2023 - royalsocietypublishing.org

From sparse descriptions of events, observers can make systematic and nuanced
predictions of what emotions the people involved will experience. We propose a formal …

Cited by 33 · 13 versions

[PDF] neurips.cc

Online Bayesian goal inference for boundedly rational planning agents

T Zhi-Xuan, J Mann, T Silver… - Advances in neural …, 2020 - proceedings.neurips.cc

People routinely infer the goals of others by observing their actions over time. Remarkably,
we can do so even when those actions lead to failure, enabling us to assist others when we …

Cited by 113 · 6 versions

[PDF] arxiv.org

AGI safety literature review

T Everitt, G Lea, M Hutter - arxiv preprint arxiv:1805.01109, 2018 - arxiv.org

The development of Artificial General Intelligence (AGI) promises to be a major event. Along
with its many potential benefits, it also raises serious safety concerns (Bostrom, 2014). The …

Cited by 158 · 8 versions

[PDF] arxiv.org

When humans aren't optimal: Robots that collaborate with risk-aware humans

M Kwon, E Biyik, A Talati, K Bhasin, DP Losey… - Proceedings of the …, 2020 - dl.acm.org

In order to collaborate safely and efficiently, robots need to anticipate how their human
partners will behave. Some of today's robots model humans as if they were also robots, and …

Cited by 111 · 11 versions

