Kernel mean embedding of distributions: A review and beyond

K Muandet, K Fukumizu… - … and Trends® in …, 2017 - nowpublishers.com
A Hilbert space embedding of a distribution—in short, a kernel mean embedding—has
recently emerged as a powerful tool for machine learning and statistical inference. The basic …

The algorithmic anatomy of model-based evaluation

ND Daw, P Dayan - … Transactions of the Royal Society B …, 2014 - royalsocietypublishing.org
Despite many debates in the first half of the twentieth century, it is now largely a truism that
humans and other animals build models of their environments and use them for prediction …

Bayesian reinforcement learning: A survey

M Ghavamzadeh, S Mannor, J Pineau… - … and Trends® in …, 2015 - nowpublishers.com
Bayesian methods for machine learning have been widely investigated, yielding principled
methods for incorporating prior information into inference algorithms. In this survey, we …

Gaussian processes for data-efficient learning in robotics and control

MP Deisenroth, D Fox… - IEEE transactions on …, 2013 - ieeexplore.ieee.org
Autonomous learning has been a promising direction in control and robotics for more than a
decade since data-driven learning allows to reduce the amount of engineering knowledge …

PILCO: A model-based and data-efficient approach to policy search

M Deisenroth, CE Rasmussen - Proceedings of the 28th …, 2011 - aiweb.cs.washington.edu
In this paper, we introduce PILCO, a practical, data-efficient model-based policy search
method. PILCO reduces model bias, one of the key problems of model-based reinforcement …

Novelty or surprise?

A Barto, M Mirolli, G Baldassarre - Frontiers in psychology, 2013 - frontiersin.org
Novelty and surprise play significant roles in animal behavior and in attempts to understand
the neural mechanisms underlying it. They also play important roles in technology, where …

Reinforcement learning and dynamic programming using function approximators

L Busoniu, R Babuska, B De Schutter, D Ernst - 2017 - taylorfrancis.com
From household appliances to applications in robotics, engineered systems involving
complex dynamics can only be as effective as the algorithms that control them. While …

Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control

ND Daw, Y Niv, P Dayan - Nature neuroscience, 2005 - nature.com
A broad range of neural and behavioral data suggests that the brain contains multiple
systems for behavioral choice, including one associated with prefrontal cortex and another …

Gaussian process dynamical models for human motion

JM Wang, DJ Fleet, A Hertzmann - IEEE transactions on pattern …, 2007 - ieeexplore.ieee.org
We introduce Gaussian process dynamical models (GPDMs) for nonlinear time series
analysis, with applications to learning models of human pose and motion from high …

Efficient exploration through Bayesian deep Q-networks

K Azizzadenesheli, E Brunskill… - 2018 Information …, 2018 - ieeexplore.ieee.org
We propose Bayesian Deep Q-Network (BDQN), a practical Thompson sampling based
Reinforcement Learning (RL) Algorithm. Thompson sampling allows for targeted exploration …