Reinforcement learning in urban network traffic signal control: A systematic literature review
Improvement of traffic signal control (TSC) efficiency has been found to lead to improved
urban transportation and enhanced quality of life. Recently, the use of reinforcement …
A gentle introduction to reinforcement learning and its application in different fields
Due to the recent progress in Deep Neural Networks, Reinforcement Learning (RL) has
become one of the most important and useful technologies. It is a learning method where a …
Deep reinforcement learning
SE Li - Reinforcement learning for sequential decision and …, 2023 - Springer
Similar to humans, RL agents use interactive learning to successfully obtain satisfactory
decision strategies. However, in many cases, it is desirable to learn directly from …
Modeling recommender ecosystems: Research challenges at the intersection of mechanism design, reinforcement learning and generative models
Modern recommender systems lie at the heart of complex ecosystems that couple the
behavior of users, content providers, advertisers, and other actors. Despite this, the focus of …
Risk-averse offline reinforcement learning
Training Reinforcement Learning (RL) agents in high-stakes applications might be too
prohibitive due to the risk associated with exploration. Thus, the agent can only use data …
Attacker-centric view of a detection game against advanced persistent threats
Advanced persistent threats (APTs) are a major threat to cyber-security, causing significant
financial and privacy losses each year. In this paper, cumulative prospect theory (CPT) is …
Off-policy risk assessment in contextual bandits
Even when unable to run experiments, practitioners can evaluate prospective policies, using
previously logged data. However, while the bandits literature has adopted a diverse set of …
Stochastic games for the smart grid energy management with prospect prosumers
In this paper, the problem of the smart grid energy management under stochastic dynamics
is investigated. In the considered model, at the demand side, it is assumed that customers …
Robust quadrupedal locomotion via risk-averse policy learning
The robustness of legged locomotion is crucial for quadrupedal robots in challenging
terrains. Recently, Reinforcement Learning (RL) has shown promising results in legged …
Concentration of risk measures: A Wasserstein distance approach
Known finite-sample concentration bounds for the Wasserstein distance between the
empirical and true distribution of a random variable are used to derive a two-sided …