Feedback-based tree search for reinforcement learning

D Jiang, E Ekwedike, H Liu - International conference on …, 2018 - proceedings.mlr.press
Inspired by recent successes of Monte-Carlo tree search (MCTS) in a number of artificial
intelligence (AI) application domains, we propose a reinforcement learning (RL) technique …

An approximately optimal relative value learning algorithm for averaged MDPs with continuous states and actions

H Sharma, R Jain - 2019 57th Annual Allerton Conference on …, 2019 - ieeexplore.ieee.org
It has long been a challenging problem to design algorithms for Markov decision processes
(MDPs) with continuous states and actions that are provably approximately optimal and can …

Empirical algorithms for general stochastic systems with continuous states and actions

H Sharma, R Jain, W Haskell - 2019 IEEE 58th Conference on …, 2019 - ieeexplore.ieee.org
In this paper, we present the Randomized Empirical Value Learning (RAEVL) algorithm for
MDPs with continuous state and action spaces. This algorithm combines the ideas of …