Deep learning for reward design to improve Monte Carlo tree search in Atari games

X Guo, S Singh, R Lewis, H Lee - arXiv preprint arXiv:1604.07095, 2016 - arxiv.org
Monte Carlo Tree Search (MCTS) methods have proven powerful in planning for sequential
decision-making problems such as Go and video games, but their performance can be poor …

Bandit algorithms in information retrieval

D Glowacka - Foundations and Trends® in Information …, 2019 - nowpublishers.com
Bandit algorithms, named after casino slot machines sometimes known as “one-armed
bandits”, fall into a broad category of stochastic scheduling problems. In the setting with …

Memory-augmented Monte Carlo tree search

C Xiao, J Mei, M Müller - Proceedings of the AAAI conference on …, 2018 - ojs.aaai.org
This paper proposes and evaluates Memory-Augmented Monte Carlo Tree Search (M-MCTS),
which provides a new approach to exploit generalization in online real-time search …

Multiple queries as bandit arms

C Li, P Resnick, Q Mei - Proceedings of the 25th ACM International on …, 2016 - dl.acm.org
Existing retrieval systems rely on a single active query to pull documents from the index.
Relevance feedback may be used to iteratively refine the query, but only one query is active …

OGA-UCT: On-the-go abstractions in UCT

A Anand, R Noothigattu, P Singla - Proceedings of the International …, 2016 - ojs.aaai.org
Recent work has begun exploring the value of domain abstractions in Monte-Carlo Tree
Search (MCTS) algorithms for probabilistic planning. These algorithms automatically …

A Unified Manifold Similarity Measure Enhancing Few-Shot, Transfer, and Reinforcement Learning in Manifold-Distributed Datasets

SW Qayyumi, LF Park, O Obst - arXiv preprint arXiv:2408.07095, 2024 - arxiv.org
Training a classifier with high mean accuracy from a manifold-distributed dataset can be
challenging. This problem is compounded further when there are only few labels available …

Deep learning and reward design for reinforcement learning

X Guo - 2017 - deepblue.lib.umich.edu
One of the fundamental problems in Artificial Intelligence is sequential decision making in a
flexible environment. Reinforcement Learning (RL) gives a set of tools for solving sequential …

Advances in Simulation-Based Search and Batch Reinforcement Learning

C Xiao - 2023 - era.library.ualberta.ca
Reinforcement learning (RL) defines a general computational problem where the learner
must learn to make good decisions through interactive experience. To be effective in solving …

Performance of the RDDL planners

D Rao, Z Jiang, Y Wen, J Li - 2016 IEEE International …, 2016 - ieeexplore.ieee.org
Recently, a new language is proposed to model probabilistic planning. It is the Relational
Dynamic Influence Diagram Language (RDDL). There are a few domains, which are …

Dissertation Abstract: Exploiting Symmetries in Sequential Decision Making under Uncertainty

A Anand - The 26th International Conference on …, 2016 - icaps16.icaps-conference.org
The problem of sequential decision making under uncertainty, often modeled as an MDP is
an important problem in planning and reinforcement learning communities. Traditional MDP …