Is Q-learning provably efficient?
Model-free reinforcement learning (RL) algorithms directly parameterize and
update value functions or policies, bypassing the modeling of the environment. They are …
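To make the "directly parameterize and update value functions" part concrete, here is a minimal sketch of a tabular, step-indexed Q-learning update in an episodic MDP. The learning-rate schedule and the optional optimism bonus are illustrative assumptions for the sketch, not a statement of the paper's exact algorithm.

```python
import numpy as np

def q_learning_step(Q, counts, t, s, a, r, s_next, H, c_bonus=1.0):
    """One model-free update of a step-indexed Q table (illustrative sketch).

    Q      : (H, S, A) action-value estimates, one table per step of the episode
    counts : (H, S, A) visit counts driving the step size and the bonus
    t      : current step within the episode (0-based)
    """
    counts[t, s, a] += 1
    k = counts[t, s, a]
    alpha = (H + 1) / (H + k)              # rescaled-linear step size (assumed)
    bonus = c_bonus * np.sqrt(H / k)       # optional optimism bonus (illustrative)
    v_next = 0.0 if t + 1 >= H else Q[t + 1, s_next].max()
    Q[t, s, a] = (1 - alpha) * Q[t, s, a] + alpha * (r + v_next + bonus)
    return Q
```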
Provably efficient reinforcement learning with linear function approximation
Modern Reinforcement Learning (RL) is commonly applied to practical problems
with an enormous number of states, where function approximation must be deployed …
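As a concrete illustration of what deploying function approximation can look like, here is a hedged sketch of a linear Q model fit by ridge regression on one-step targets, in the spirit of least-squares value iteration; the feature map, regression targets, and regularizer are assumptions of the sketch rather than the paper's algorithm.

```python
import numpy as np

def fit_linear_q(phi, rewards, next_values, lam=1.0):
    """Fit Q(s, a) ≈ phi(s, a) @ w by regularized least squares (sketch).

    phi         : (n, d) state-action features of observed transitions
    rewards     : (n,) observed rewards
    next_values : (n,) estimates of max_a' Q(s', a') at the next states
    """
    targets = rewards + next_values                  # one-step regression targets
    d = phi.shape[1]
    A = phi.T @ phi + lam * np.eye(d)                # ridge-regularized Gram matrix
    return np.linalg.solve(A, phi.T @ targets)

def linear_q_value(w, phi_sa):
    """Evaluate the linear approximation at a single (s, a) feature vector."""
    return phi_sa @ w
```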
Pessimistic Q-learning for offline reinforcement learning: Towards optimal sample complexity
Offline or batch reinforcement learning seeks to learn a near-optimal policy using history
data without active exploration of the environment. To counter the insufficient coverage and …
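A rough sketch of the pessimism idea the title points to: when learning from a fixed batch, subtract a coverage-based penalty from the backup target so that state-action pairs the data barely visits are not over-valued. The penalty shape and constants below are assumptions for illustration, not the paper's bonus.

```python
import numpy as np

def pessimistic_backup(Q, counts, t, s, a, r, s_next, H, c_pen=1.0):
    """One pessimistic backup from an offline transition (illustrative sketch).

    Mirrors a Q-learning update, but subtracts a penalty that grows when the
    (step, state, action) triple is poorly covered by the batch, clipping at zero.
    """
    counts[t, s, a] += 1
    k = counts[t, s, a]
    alpha = (H + 1) / (H + k)               # step size, assumed for the sketch
    penalty = c_pen * np.sqrt(H / k)        # coverage penalty (illustrative)
    v_next = 0.0 if t + 1 >= H else Q[t + 1, s_next].max()
    target = max(r + v_next - penalty, 0.0)
    Q[t, s, a] = (1 - alpha) * Q[t, s, a] + alpha * target
    return Q
```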
Provably efficient exploration in policy optimization
While policy-based reinforcement learning (RL) achieves tremendous successes in practice,
it is significantly less understood in theory, especially compared with value-based RL. In …
Sample-optimal parametric Q-learning using linearly additive features
Consider a Markov decision process (MDP) that admits a set of state-action features, which
can linearly express the process's probabilistic transition model. We propose a parametric Q …
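The structural assumption in the snippet can be read as P(s' | s, a) = Σ_k φ_k(s, a) ψ_k(s'), so expectations of any value function over next states reduce to an inner product with a K-dimensional weight vector. Below is a hedged sketch of evaluating such a factored transition model and the parametric Q values it induces; the array shapes and the discounted form of Q are assumptions for illustration.

```python
import numpy as np

def transition_probs(phi_sa, psi):
    """P(. | s, a) = sum_k phi_k(s, a) * psi_k(.), the linearly additive model (sketch).

    phi_sa : (K,) feature vector of the pair (s, a)
    psi    : (K, S) array whose k-th row is a factor over next states
    """
    return phi_sa @ psi                     # (S,) next-state probability vector

def parametric_q(phi_sa, r_sa, w, gamma=0.99):
    """Q(s, a) = r(s, a) + gamma * phi(s, a) @ w, where w = psi @ V collapses
    the expectation E[V(s') | s, a] under the factored model."""
    return r_sa + gamma * phi_sa @ w
```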
Minimum cost flows, MDPs, and ℓ1-regression in nearly linear time for dense instances
In this paper we provide new randomized algorithms with improved runtimes for solving
linear programs with two-sided constraints. In the special case of the minimum cost flow …
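For context on the problem class named in the title, a linear program with two-sided constraints can be written as

\[ \min_{x \in \mathbb{R}^m} \; c^{\top} x \quad \text{subject to} \quad A x = b, \qquad \ell \le x \le u, \]

with minimum cost flow arising as the special case where $A$ is the node-arc incidence matrix of the graph, $b$ the vertex demands, and $\ell, u$ the arc capacity bounds. This is standard background on the problem class, not a description of the paper's algorithm.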
Almost optimal model-free reinforcement learning via reference-advantage decomposition
We study the reinforcement learning problem in the setting of finite-horizon episodic Markov
Decision Processes (MDPs) with S states, A actions, and episode length H. We propose a …
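A loose sketch of the decomposition named in the title: the one-step backup target is split into a reference part, built from a slowly changing reference value function, and a small advantage part that corrects it; the two parts can then be averaged with different weights to reduce variance. The shapes and the frozen-reference convention below are assumptions of the sketch, and the paper's actual weighting scheme is omitted.

```python
def reference_advantage_target(Q, V_ref, t, r, s_next, H):
    """Split a backup target into reference and advantage terms (illustrative sketch).

    Q     : (H, S, A) current action-value estimates
    V_ref : (H + 1, S) reference values, kept fixed once judged accurate enough
    """
    v_next = 0.0 if t + 1 >= H else Q[t + 1, s_next].max()
    v_ref_next = 0.0 if t + 1 >= H else V_ref[t + 1, s_next]
    reference_term = r + v_ref_next          # large but nearly static component
    advantage_term = v_next - v_ref_next     # small, still-changing correction
    return reference_term, advantage_term
```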
Interdisciplinary Perspectives on Agent-Based Modeling in the Architecture, Engineering, and Construction Industry: A Comprehensive Review
S Mazzetto - Buildings, 2024 - mdpi.com
This paper explores the transformative impact of agent-based modeling (ABM) on the
architecture, engineering, and construction (AEC) industry, highlighting its indispensable …
Near-optimal time and sample complexities for solving Markov decision processes with a generative model
In this paper we consider the problem of computing an $\epsilon$-optimal policy of a
discounted Markov Decision Process (DMDP) provided we can only access its transition …
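The generative-model access referred to here means the learner may request independent samples s' ~ P(· | s, a) for any state-action pair of its choosing. Below is a minimal sketch of the basic primitive this enables, a Monte Carlo estimate of one Bellman backup; the sampler interface, sample count, and discount are assumptions of the sketch.

```python
import numpy as np

def sampled_backup(sample_next_state, r_sa, V, s, a, n_samples=100, gamma=0.9):
    """Monte Carlo estimate of r(s, a) + gamma * E[V(s') | s, a] (sketch).

    sample_next_state : callable (s, a) -> s', the assumed generative-model interface
    V                 : (S,) current value estimates indexed by state
    """
    draws = [V[sample_next_state(s, a)] for _ in range(n_samples)]
    return r_sa + gamma * float(np.mean(draws))
```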
Model-based reinforcement learning with a generative model is minimax optimal
This work considers the sample and computational complexity of obtaining an $\epsilon$-
optimal policy in a discounted Markov Decision Process (MDP), given only access to a …
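A minimal sketch of the model-based (plug-in) recipe the title refers to: draw a fixed number of generative-model samples per state-action pair, form the empirical transition matrix, and run value iteration on that empirical MDP. The sample budget, known reward matrix, and iteration count are assumptions of the sketch, not the paper's analysis.

```python
import numpy as np

def plug_in_policy(sample_next_state, R, S, A, n_per_pair=100, gamma=0.9, iters=1000):
    """Estimate an empirical MDP from generative samples and plan in it (sketch).

    sample_next_state : callable (s, a) -> s', assumed generative-model interface
    R                 : (S, A) reward matrix, assumed known for simplicity
    """
    P_hat = np.zeros((S, A, S))
    for s in range(S):
        for a in range(A):
            for _ in range(n_per_pair):
                P_hat[s, a, sample_next_state(s, a)] += 1
    P_hat /= n_per_pair                       # empirical transition probabilities
    V = np.zeros(S)
    for _ in range(iters):                    # value iteration on the empirical model
        Q = R + gamma * P_hat @ V             # (S, A) plug-in Bellman backup
        V = Q.max(axis=1)
    return Q.argmax(axis=1)                   # greedy policy w.r.t. the empirical MDP
```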