A practitioner's guide to MDP model checking algorithms
Abstract Model checking undiscounted reachability and expected-reward properties on
Markov decision processes (MDPs) is key for the verification of systems that act under …
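The maximum reachability probabilities that the entry above refers to satisfy a Bellman fixed-point equation, which plain value iteration approximates from below. A minimal sketch (the state/action/transition encoding is an illustrative assumption; note that the small-change stopping criterion used here is not a sound error bound in general, which is why tools use variants such as interval iteration):

```python
def max_reachability(states, actions, P, target, eps=1e-8, max_iter=10**6):
    """Gauss-Seidel value iteration for Pr^max(reach target) in an MDP.
    P[(s, a)] is a list of (successor, probability) pairs and actions(s)
    lists the actions enabled in state s.  All names are illustrative."""
    v = {s: (1.0 if s in target else 0.0) for s in states}
    for _ in range(max_iter):
        delta = 0.0
        for s in states:
            if s in target:
                continue
            # Bellman backup: best one-step expected value over enabled actions.
            best = max(sum(p * v[t] for t, p in P[(s, a)]) for a in actions(s))
            delta = max(delta, abs(best - v[s]))
            v[s] = best
        if delta < eps:
            break
    return v
```

On a three-state toy MDP where state 0 reaches the target state 2 with probability 0.5 and otherwise falls into a sink, the iteration converges to `v[0] == 0.5`.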
Strategy iteration is strongly polynomial for 2-player turn-based stochastic games with a constant discount factor
Ye [2011] showed recently that the simplex method with Dantzig's pivoting rule, as well as
Howard's policy iteration algorithm, solve discounted Markov decision processes (MDPs) …
Optimal convergence rate for exact policy mirror descent in discounted markov decision processes
Abstract Policy Mirror Descent (PMD) is a general family of algorithms that covers a wide
range of novel and fundamental methods in reinforcement learning. Motivated by the …
The simplex and policy-iteration methods are strongly polynomial for the Markov decision problem with a fixed discount rate
Y Ye - Mathematics of Operations Research, 2011 - pubsonline.informs.org
We prove that the classic policy-iteration method [Howard, RA 1960. Dynamic Programming
and Markov Processes. MIT, Cambridge] and the original simplex method with the most …
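The classic policy-iteration method analyzed in the entry above alternates exact policy evaluation with greedy improvement. A compact sketch for a discounted MDP (the tensor shapes and the toy MDP below are illustrative conventions, not taken from the paper):

```python
import numpy as np

def policy_iteration(P, R, gamma=0.9):
    """Howard's policy iteration for a discounted MDP.
    P: (A, S, S) tensor with P[a, s, t] = Pr(t | s, a); R: (S, A) rewards."""
    A, S, _ = P.shape
    policy = np.zeros(S, dtype=int)
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
        P_pi = P[policy, np.arange(S)]        # row s is P[policy[s], s, :]
        r_pi = R[np.arange(S), policy]
        v = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
        # Policy improvement: greedy one-step lookahead on Q(s, a).
        Q = R + gamma * (P @ v).T             # shape (S, A)
        new_policy = Q.argmax(axis=1)
        if np.array_equal(new_policy, policy):
            return policy, v
        policy = new_policy
```

On a two-state example where action 1 swaps states and state 1 pays reward 1 for staying, the method finds the policy "move to state 1, then stay", with values v = (9, 10) at gamma = 0.9.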
Improved and generalized upper bounds on the complexity of policy iteration
B Scherrer - Advances in Neural Information Processing …, 2013 - proceedings.neurips.cc
Abstract Given a Markov Decision Process (MDP) with $n$ states and $m$ actions per
state, we study the number of iterations needed by Policy Iteration (PI) algorithms to …
Acting in delayed environments with non-stationary markov policies
The standard Markov Decision Process (MDP) formulation hinges on the assumption that an
action is executed immediately after it was chosen. However, this assumption is often unrealistic …
Universal trees grow inside separating automata: Quasi-polynomial lower bounds for parity games
Several distinct techniques have been proposed to design quasi-polynomial algorithms for
solving parity games since the breakthrough result of Calude, Jain, Khoussainov, Li, and …
Subexponential lower bounds for randomized pivoting rules for the simplex algorithm
The simplex algorithm is among the most widely used algorithms for solving linear programs
in practice. With essentially all deterministic pivoting rules it is known, however, to require an …
Parity games: Zielonka's algorithm in quasi-polynomial time
P Parys - arXiv preprint arXiv:1904.12446, 2019 - arxiv.org
Calude, Jain, Khoussainov, Li, and Stephan (2017) proposed a quasi-polynomial-time
algorithm solving parity games. After this breakthrough result, a few other quasi-polynomial …
A recursive approach to solving parity games in quasipolynomial time
Zielonka's classic recursive algorithm for solving parity games is perhaps the simplest
among the many existing parity game algorithms. However, its complexity is exponential …
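Zielonka's recursive algorithm mentioned in the last two entries is short enough to sketch in full. The graph encoding below is an illustrative assumption, and the sketch assumes total games (every vertex keeps at least one successor in each subgame where its owner must move):

```python
def attractor(V, E, owner, U, player):
    """Vertices in V from which `player` can force the play into U."""
    attr = set(U)
    changed = True
    while changed:
        changed = False
        for v in V - attr:
            succs = [w for w in E[v] if w in V]
            if owner[v] == player:
                if any(w in attr for w in succs):   # player picks one good edge
                    attr.add(v)
                    changed = True
            elif succs and all(w in attr for w in succs):  # opponent is trapped
                attr.add(v)
                changed = True
    return attr

def zielonka(V, E, owner, priority):
    """Returns (W0, W1), the winning regions of players 0 and 1."""
    if not V:
        return set(), set()
    p = max(priority[v] for v in V)
    i = p % 2                                # player favoured by parity of p
    U = {v for v in V if priority[v] == p}
    A = attractor(V, E, owner, U, i)
    W = list(zielonka(V - A, E, owner, priority))
    if not W[1 - i]:
        W[i] = set(V)                        # player i wins everywhere
        return W[0], W[1]
    B = attractor(V, E, owner, W[1 - i], 1 - i)
    W2 = list(zielonka(V - B, E, owner, priority))
    W2[1 - i] |= B
    return W2[0], W2[1]
```

The exponential behaviour discussed above comes from the second recursive call on `V - B`; the quasi-polynomial variants in the entries above restrict how often that recursion can repeat work.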