Google Scholar

Deep learning for reward design to improve Monte Carlo tree search in Atari games

X Guo, S Singh, R Lewis, H Lee - arXiv preprint arXiv:1604.07095, 2016 - arxiv.org

Monte Carlo Tree Search (MCTS) methods have proven powerful in planning for sequential
decision-making problems such as Go and video games, but their performance can be poor …

Cited by 60

Bandit algorithms in information retrieval

D Glowacka - Foundations and Trends® in Information …, 2019 - nowpublishers.com

Bandit algorithms, named after casino slot machines sometimes known as “one-armed
bandits”, fall into a broad category of stochastic scheduling problems. In the setting with …

Memory-augmented Monte Carlo tree search

C Xiao, J Mei, M Müller - Proceedings of the AAAI conference on …, 2018 - ojs.aaai.org

This paper proposes and evaluates Memory-Augmented Monte Carlo Tree Search (M-
MCTS), which provides a new approach to exploit generalization in online real-time search …

Cited by 26

Multiple queries as bandit arms

C Li, P Resnick, Q Mei - Proceedings of the 25th ACM International on …, 2016 - dl.acm.org

Existing retrieval systems rely on a single active query to pull documents from the index.
Relevance feedback may be used to iteratively refine the query, but only one query is active …

Cited by 22

OGA-UCT: On-the-go abstractions in UCT

A Anand, R Noothigattu, P Singla - Proceedings of the International …, 2016 - ojs.aaai.org

Recent work has begun exploring the value of domain abstractions in Monte-Carlo Tree
Search (MCTS) algorithms for probabilistic planning. These algorithms automatically …

Cited by 17

A Unified Manifold Similarity Measure Enhancing Few-Shot, Transfer, and Reinforcement Learning in Manifold-Distributed Datasets

SW Qayyumi, LF Park, O Obst - arXiv preprint arXiv:2408.07095, 2024 - arxiv.org

Training a classifier with high mean accuracy from a manifold-distributed dataset can be
challenging. This problem is compounded further when there are only few labels available …

Deep learning and reward design for reinforcement learning

X Guo - 2017 - deepblue.lib.umich.edu

One of the fundamental problems in Artificial Intelligence is sequential decision making in a
flexible environment. Reinforcement Learning (RL) gives a set of tools for solving sequential …

Cited by 8

Advances in Simulation-Based Search and Batch Reinforcement Learning

C Xiao - 2023 - era.library.ualberta.ca

Reinforcement learning (RL) defines a general computational problem where the learner
must learn to make good decisions through interactive experience. To be effective in solving …

Performance of the RDDL planners

D Rao, Z Jiang, Y Wen, J Li - 2016 IEEE International …, 2016 - ieeexplore.ieee.org

Recently, a new language is proposed to model probabilistic planning. It is the Relational
Dynamic Influence Diagram Language (RDDL). There are a few domains, which are …

Cited by 1

Dissertation Abstract: Exploiting Symmetries in Sequential Decision Making under Uncertainty

A Anand - The 26th International Conference on …, 2016 - icaps16.icaps-conference.org

The problem of sequential decision making under uncertainty, often modeled as an MDP is
an important problem in planning and reinforcement learning communities. Traditional MDP …

Saved to library

Improving exploration in UCT using local manifolds

Deep learning for reward design to improve monte carlo tree search in atari games
