Google Scholar

Monte Carlo tree search: A review of recent modifications and applications

M Świechowski, K Godlewski, B Sawicki… - Artificial Intelligence …, 2023 - Springer

Monte Carlo Tree Search (MCTS) is a powerful approach to designing game-playing
bots or solving sequential decision problems. The method relies on intelligent tree …

Cited by 325, all 12 versions

[PDF] nowpublishers.com

A tutorial on Thompson sampling

DJ Russo, B Van Roy, A Kazerouni… - … and Trends® in …, 2018 - nowpublishers.com

Thompson sampling is an algorithm for online decision problems where actions are taken
sequentially in a manner that must balance between exploiting what is known to maximize …

[PDF] neurips.cc

Learning to optimize via information-directed sampling

D Russo, B Van Roy - Advances in neural information …, 2014 - proceedings.neurips.cc

We propose information-directed sampling--a new algorithm for online optimization
problems in which a decision-maker must balance between exploration and exploitation …

[PDF] informs.org

Learning to optimize via information-directed sampling

D Russo, B Van Roy - Operations Research, 2018 - pubsonline.informs.org

We propose information-directed sampling—a new approach to online optimization
problems in which a decision maker must balance between exploration and exploitation …

Cited by 154, all 6 versions

[PDF] aaai.org

Adaptive anytime multi-agent path finding using bandit-based large neighborhood search

T Phan, T Huang, B Dilkina, S Koenig - Proceedings of the AAAI …, 2024 - ojs.aaai.org

Anytime multi-agent path finding (MAPF) is a promising approach to scalable path
optimization in large-scale multi-agent systems. State-of-the-art anytime MAPF is based on …

Cited by 11, all 6 versions, HTML version

[PDF] mlr.press

Online learning of decision trees with Thompson sampling

A Chaouki, J Read, A Bifet - International Conference on …, 2024 - proceedings.mlr.press

Decision Trees are prominent prediction models for interpretable Machine Learning. They
have been thoroughly researched, mostly in the batch setting with a fixed labelled dataset …

Cited by 3, all 6 versions, HTML version

[PDF] github.io

Toward effective soft robot control via reinforcement learning

H Zhang, R Cao, S Zilberstein, F Wu… - Intelligent Robotics and …, 2017 - Springer

A soft robot is a kind of robot that is constructed with soft, deformable and elastic materials.
Control of soft robots presents complex modeling and planning challenges. We introduce a …

Cited by 54, all 3 versions

[HTML] sciencedirect.com

[HTML] Branching time active inference: the theory and its generality

T Champion, L Da Costa, H Bowman, M Grześ - Neural Networks, 2022 - Elsevier

Over the last 10 to 15 years, active inference has helped to explain various brain
mechanisms from habit formation to dopaminergic discharge and even modelling curiosity …

Cited by 22, all 15 versions

[PDF] github.io

Online planning for large Markov decision processes with hierarchical decomposition

A Bai, F Wu, X Chen - ACM Transactions on Intelligent Systems and …, 2015 - dl.acm.org

Markov decision processes (MDPs) provide a rich framework for planning under uncertainty.
However, exactly solving a large MDP is usually intractable due to the “curse of …

Cited by 47, all 5 versions

Automated conceptual design of mechanisms based on Thompson Sampling and Monte Carlo Tree Search

J Mao, Y Zhu, G Chen, C Yan, W Zhang - Applied Soft Computing, 2025 - Elsevier

Conceptual design of mechanisms is a crucial part of achieving product innovation as
mechanisms perform the transmission and transformation of specific motions in the machine …

Bayesian mixture modelling and inference based Thompson sampling in Monte-Carlo tree search

Monte Carlo tree search: A review of recent modifications and applications
