Deep learning for reward design to improve Monte Carlo tree search in Atari games
Monte Carlo Tree Search (MCTS) methods have proven powerful in planning for sequential
decision-making problems such as Go and video games, but their performance can be poor …
Bandit algorithms in information retrieval
D Glowacka - Foundations and Trends® in Information …, 2019 - nowpublishers.com
Bandit algorithms, named after casino slot machines sometimes known as “one-armed
bandits”, fall into a broad category of stochastic scheduling problems. In the setting with …
Memory-augmented Monte Carlo tree search
This paper proposes and evaluates Memory-Augmented Monte Carlo Tree Search (M-
MCTS), which provides a new approach to exploit generalization in online real-time search …
Multiple queries as bandit arms
Existing retrieval systems rely on a single active query to pull documents from the index.
Relevance feedback may be used to iteratively refine the query, but only one query is active …
OGA-UCT: On-the-go abstractions in UCT
Recent work has begun exploring the value of domain abstractions in Monte-Carlo Tree
Search (MCTS) algorithms for probabilistic planning. These algorithms automatically …
A Unified Manifold Similarity Measure Enhancing Few-Shot, Transfer, and Reinforcement Learning in Manifold-Distributed Datasets
Training a classifier with high mean accuracy from a manifold-distributed dataset can be
challenging. This problem is compounded further when there are only few labels available …
Deep learning and reward design for reinforcement learning
X Guo - 2017 - deepblue.lib.umich.edu
One of the fundamental problems in Artificial Intelligence is sequential decision making in a
flexible environment. Reinforcement Learning (RL) gives a set of tools for solving sequential …
Advances in Simulation-Based Search and Batch Reinforcement Learning
C **ao - 2023 - era.library.ualberta.ca
Reinforcement learning (RL) defines a general computational problem where the learner
must learn to make good decisions through interactive experience. To be effective in solving …
Performance of the RDDL planners
Recently, a new language is proposed to model probabilistic planning. It is the Relational
Dynamic Influence Diagram Language (RDDL). There are a few domains, which are …
Dissertation Abstract: Exploiting Symmetries in Sequential Decision Making under Uncertainty
A Anand - The 26th International Conference on …, 2016 - icaps16.icaps-conference.org
The problem of sequential decision making under uncertainty, often modeled as an MDP is
an important problem in planning and reinforcement learning communities. Traditional MDP …