Reinforcement learning: A tutorial survey and recent advances
A Gosavi - INFORMS Journal on Computing, 2009 - pubsonline.informs.org
In the last few years, reinforcement learning (RL), also called adaptive (or approximate)
dynamic programming, has emerged as a powerful tool for solving complex sequential …
[BOOK][B] Control systems and reinforcement learning
S Meyn - 2022 - books.google.com
A high school student can create deep Q-learning code to control her robot, without any
understanding of the meaning of 'deep' or 'Q', or why the code sometimes fails. This book is …
[BOOK][B] Markov chains: Basic definitions
R Douc, E Moulines, P Priouret, P Soulier, R Douc… - 2018 - Springer
Heuristically, a discrete-time stochastic process has the Markov property if the past and
future are independent given the present. In this introductory chapter, we give the formal …
Operational Research: methods and applications
Abstract Throughout its history, Operational Research has evolved to include methods,
models and algorithms that have been applied to a wide range of contexts. This …
[BOOK][B] Markov chains and stochastic stability
SP Meyn, RL Tweedie - 2012 - books.google.com
Markov Chains and Stochastic Stability is part of the Communications and Control
Engineering Series (CCES) edited by Professors BW Dickinson, ED Sontag, M. Thoma, A …
The ODE method for convergence of stochastic approximation and reinforcement learning
It is shown here that stability of the stochastic approximation algorithm is implied by the
asymptotic stability of the origin for an associated ODE. This in turn implies convergence of …
On positive Harris recurrence of multiclass queueing networks: a unified approach via fluid limit models
JG Dai - The Annals of Applied Probability, 1995 - projecteuclid.org
It is now known that the usual traffic condition (the nominal load being less than 1 at each
station) is not sufficient for stability for a multiclass open queueing network. Although there …
Stochastic networked control systems
Our goal in writing this book has been to provide a comprehensive, mathematically rigorous,
but still accessible treatment of the interaction between information and control in multi …
[BOOK][B] Control techniques for complex networks
S Meyn - 2008 - books.google.com
Power grids, flexible manufacturing, cellular communications: interconnectedness has
consequences. This remarkable book gives the tools and philosophy you need to build …
[PDF][PDF] Scheduling for multiple flows sharing a time-varying channel: The exponential rule
S Shakkottai, AL Stolyar - Translations of the American …, 2002 - researchgate.net
We consider the following queueing system which arises as a model of a wireless link
shared by multiple users. Multiple flows must be served by a "channel" (server). The channel …