- Academic Search

An overview of multi-agent reinforcement learning from game theoretical perspective

Y Yang, J Wang - arXiv preprint arXiv:2011.00583, 2020 - arxiv.org

Following the remarkable success of the AlphaGO series, 2019 was a booming year that
witnessed significant advances in multi-agent reinforcement learning (MARL) techniques …

Cited by 351

[PDF] fransoliehoek.net

[BOOK][B] A concise introduction to decentralized POMDPs

FA Oliehoek, C Amato - 2016 - Springer

This book presents an overview of formal decision making methods for decentralized
cooperative systems. It is aimed at graduate students and researchers in the fields of …

Cited by 1387

[PDF] bookfusion.com

[BOOK][B] Algorithms for reinforcement learning

C Szepesvári - 2022 - books.google.com

Reinforcement learning is a learning paradigm concerned with learning to control a system
so as to maximize a numerical performance measure that expresses a long-term objective …

Cited by 2260

[PDF] springer.com

Active inference and agency: optimal control without cost functions

K Friston, S Samothrakis, R Montague - Biological cybernetics, 2012 - Springer

This paper describes a variational free-energy formulation of (partially observable) Markov
decision problems in decision making under uncertainty. We show that optimal control can …

Cited by 324

[HTML] nih.gov

Goal-directed decision making as probabilistic inference: a computational framework and potential neural correlates.

A Solway, MM Botvinick - Psychological review, 2012 - psycnet.apa.org

Recent work has given rise to the view that reward-based decision making is governed by
two key controllers: a habit system, which stores stimulus–response associations shaped by …

Cited by 258

[PDF] neurips.cc

Variational policy search via trajectory optimization

S Levine, V Koltun - Advances in neural information …, 2013 - proceedings.neurips.cc

In order to learn effective control policies for dynamical systems, policy search methods must
be able to discover successful executions of the desired task. While random exploration can …

Cited by 138

[PDF] arxiv.org

Learning deep neural network policies with continuous memory states

M Zhang, Z McCarthy, C Finn, S Levine… - … on robotics and …, 2016 - ieeexplore.ieee.org

Policy learning for partially observed control tasks requires policies that can remember
salient information from past observations. In this paper, we present a method for learning …

Cited by 106

[PDF] neurips.cc

Program synthesis guided reinforcement learning for partially observed environments

Y Yang, JP Inala, O Bastani, Y Pu… - Advances in neural …, 2021 - proceedings.neurips.cc

A key challenge for reinforcement learning is solving long-horizon planning problems.
Recent work has leveraged programs to guide reinforcement learning in these settings …

Cited by 43

[PDF] psu.edu

[PDF][PDF] Probabilistic inference as a model of planned behavior.

M Toussaint - Künstliche Intell., 2009 - Citeseer

The problem of planning and goal-directed behavior has been addressed in computer
science for many years, typically based on classical concepts like Bellman's optimality …

Cited by 152

[PDF] aaai.org

PUMA: Planning under uncertainty with macro-actions

R He, E Brunskill, N Roy - Proceedings of the AAAI Conference on …, 2010 - ojs.aaai.org

Planning in large, partially observable domains is challenging, especially when a long-
horizon lookahead is necessary to obtain a good policy. Traditional POMDP planners that …

Cited by 102

Saved in My library

Hierarchical POMDP Controller Optimization by Likelihood Maximization.

An overview of multi-agent reinforcement learning from game theoretical perspective
