Reinforcement learning in urban network traffic signal control: A systematic literature review

M Noaeen, A Naik, L Goodman, J Crebo, T Abrar… - Expert Systems with …, 2022 - Elsevier
Improvement of traffic signal control (TSC) efficiency has been found to lead to improved
urban transportation and enhanced quality of life. Recently, the use of reinforcement …

Reinforcement learning methods for computation offloading: a systematic review

Z Zabihi, AM Eftekhari Moghadam… - ACM Computing …, 2023 - dl.acm.org
Today, cloud computation offloading may not be an appropriate solution for delay-sensitive
applications due to the long distance between end-devices and remote datacenters. In …

GANs trained by a two time-scale update rule converge to a local Nash equilibrium

M Heusel, H Ramsauer, T Unterthiner… - Advances in neural …, 2017 - proceedings.neurips.cc
Generative Adversarial Networks (GANs) excel at creating realistic images with
complex models for which maximum likelihood is infeasible. However, the convergence of …

Actor-critic algorithms

V Konda, J Tsitsiklis - Advances in neural information …, 1999 - proceedings.neurips.cc
We propose and analyze a class of actor-critic algorithms for simulation-based optimization
of a Markov decision process over a parameterized family of randomized stationary policies …

Stochastic Mechanics Applications of

A Board - 2003 - Springer
The original work in recursive stochastic algorithms was by Robbins and Monro, who
developed and analyzed a recursive procedure for finding the root of a real-valued function …

Mildly conservative Q-learning for offline reinforcement learning

J Lyu, X Ma, X Li, Z Lu - Advances in Neural Information …, 2022 - proceedings.neurips.cc
Offline reinforcement learning (RL) defines the task of learning from a static logged dataset
without continually interacting with the environment. The distribution shift between the …

A survey of actor-critic reinforcement learning: Standard and natural policy gradients

I Grondman, L Busoniu, GAD Lopes… - IEEE Transactions on …, 2012 - ieeexplore.ieee.org
Policy-gradient-based actor-critic algorithms are amongst the most popular algorithms in the
reinforcement learning framework. Their advantage of being able to search for optimal …

Control systems and reinforcement learning

S Meyn - 2022 - books.google.com
A high school student can create deep Q-learning code to control her robot, without any
understanding of the meaning of 'deep' or 'Q', or why the code sometimes fails. This book is …

Closing the gap: Tighter analysis of alternating stochastic gradient methods for bilevel problems

T Chen, Y Sun, W Yin - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Stochastic nested optimization, including stochastic compositional, min-max, and bilevel
optimization, is gaining popularity in many machine learning applications. While the three …

Reinforcement Learning and Stochastic Optimization: A Unified Framework for Sequential Decisions: by Warren B. Powell (ed.), Wiley (2022). Hardback. ISBN …

I Halperin - 2022 - Taylor & Francis
What is reinforcement learning? How is reinforcement learning different from stochastic
optimization? And finally, can it be used for applications to quantitative finance for my current …