Improved and generalized upper bounds on the complexity of policy iteration

B Scherrer - Advances in Neural Information Processing …, 2013 - proceedings.neurips.cc
Given a Markov Decision Process (MDP) with $n$ states and $m$ actions per
state, we study the number of iterations needed by Policy Iteration (PI) algorithms to …
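Several entries below study variants of Policy Iteration; for reference, a minimal sketch of Howard's PI for a discounted MDP (all array shapes and names here are illustrative, not taken from any of the listed papers):

```python
import numpy as np

def policy_iteration(P, R, gamma=0.9):
    """Howard's Policy Iteration for a discounted MDP.

    P: transition tensor, shape (m, n, n); P[a, s, s'] = Pr(s' | s, a).
    R: reward matrix, shape (n, m); R[s, a] = expected one-step reward.
    Returns an optimal deterministic policy (length-n int array) and its value.
    """
    m, n, _ = P.shape
    pi = np.zeros(n, dtype=int)  # arbitrary initial policy
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
        P_pi = P[pi, np.arange(n)]          # (n, n): row s is P[pi[s], s, :]
        R_pi = R[np.arange(n), pi]          # (n,)
        V = np.linalg.solve(np.eye(n) - gamma * P_pi, R_pi)
        # Greedy improvement: switch every improvable state (max-gain rule).
        Q = R + gamma * np.einsum('aij,j->ia', P, V)   # (n, m) action values
        pi_new = Q.argmax(axis=1)
        if np.array_equal(pi_new, pi):      # no improvable state: optimal
            return pi, V
        pi = pi_new
```

The iteration counts studied in these papers bound how many times the improvement step above can fire before the policy stabilizes.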

Multi-gear bandits, partial conservation laws, and indexability

J Niño-Mora - Mathematics, 2022 - mdpi.com
This paper considers what we propose to call multi-gear bandits, which are Markov decision
processes modeling a generic dynamic and stochastic project fueled by a single resource …

The Smoothed Complexity of Policy Iteration for Markov Decision Processes

M Christ, M Yannakakis - Proceedings of the 55th Annual ACM …, 2023 - dl.acm.org
We show subexponential lower bounds (i.e., $2^{\Omega(n^c)}$) on the smoothed complexity of the
classical Howard's Policy Iteration algorithm for Markov Decision Processes. The bounds …

Geometric policy iteration for Markov decision processes

Y Wu, JA De Loera - Proceedings of the 28th ACM SIGKDD Conference …, 2022 - dl.acm.org
Recently discovered polyhedral structures of the value function for finite discounted Markov
decision processes (MDP) shed light on understanding the success of reinforcement …

Randomised procedures for initialising and switching actions in policy iteration

S Kalyanakrishnan, N Misra, A Gopalan - Proceedings of the AAAI …, 2016 - ojs.aaai.org
Policy Iteration (PI) (Howard 1960) is a classical method for computing an optimal
policy for a finite Markov Decision Problem (MDP). The method is conceptually simple …

[HTML] A complexity analysis of Policy Iteration through combinatorial matrices arising from Unique Sink Orientations

B Gerencsér, R Hollanders, JC Delvenne… - Journal of Discrete …, 2017 - Elsevier
Unique Sink Orientations (USOs) are an appealing abstraction of several major
optimization problems of applied mathematics such as Linear Programming (LP), Markov …

Upper Bounds for All and Max-gain Policy Iteration Algorithms on Deterministic MDPs

R Goenka, E Gupta, S Khyalia, P Agarwal… - arXiv preprint arXiv …, 2022 - arxiv.org
Policy Iteration (PI) is a widely used family of algorithms to compute optimal policies for
Markov Decision Problems (MDPs). We derive upper bounds on the running time of PI on …

A low-rank approximation for MDPs via moment coupling

ABZ Zhang, I Gurvich - Operations Research, 2024 - pubsonline.informs.org
We introduce a framework to approximate Markov decision processes (MDPs) that stands on
two pillars: (i) state aggregation, as the algorithmic infrastructure, and (ii) central-limit …

[BOOK] Exploiting Model Smoothness in Dynamic Decisions

ABZ Zhang - 2022 - search.proquest.com
Utilizing structure in mathematical modeling is instrumental for better model design, creation,
and solution. In this dissertation, we explore smoothness-based structure for problems …

[PDF] Theoretical Analysis of Policy Iteration

S Kalyanakrishnan - 2017 - cse.iitb.ac.in
Theoretical Analysis of Policy Iteration.
Shivaram Kalyanakrishnan, Department of Computer Science and Engineering, Indian Institute …