Reinforcement learning for crop management support: Review, prospects and challenges

R Gautron, OA Maillard, P Preux, M Corbeels… - … and Electronics in …, 2022 - Elsevier
Reinforcement learning (RL), including multi-armed bandits, is a branch of machine learning
that deals with the problem of sequential decision-making in uncertain and unknown …

[BOOK][B] Simulation-based optimization

A Gosavi - 2015 - Springer
This book is written for students and researchers in the field of industrial engineering,
computer science, operations research, management science, electrical engineering, and …

A reinforcement learning extension to the Almgren-Chriss framework for optimal trade execution

D Hendricks, D Wilcox - 2014 IEEE Conference on …, 2014 - ieeexplore.ieee.org
Reinforcement learning is explored as a candidate machine learning technique to enhance
existing analytical solutions for optimal trade execution with elements from the market …

Efficient energy management in smart grids with finite horizon Q-learning

VP Vivek, S Bhatnagar - Sustainable Energy, Grids and Networks, 2024 - Elsevier
Efficient energy distribution in smart grids is an important problem driven by the need to
manage increasing power consumption across the globe. This problem has been studied in …

[HTML] A simple learning agent interacting with an agent-based market model

M Dicks, A Paskaramoorthy, T Gebbie - Physica A: Statistical Mechanics …, 2024 - Elsevier
We consider the learning dynamics of a single reinforcement learning optimal execution
trading agent when it interacts with an event-driven agent-based financial market model …

A policy gradient approach for finite horizon constrained Markov decision processes

S Guin, S Bhatnagar - 2023 62nd IEEE Conference on Decision …, 2023 - ieeexplore.ieee.org
The infinite horizon setting is widely adopted for problems of reinforcement learning (RL).
These invariably result in stationary policies that are optimal. In many situations, finite …

An adaptive bilateral negotiation model for e-commerce settings

V Narayanan, NR Jennings - Seventh IEEE International …, 2005 - ieeexplore.ieee.org
This paper studies adaptive bilateral negotiation between software agents in e-commerce
environments. Specifically, we assume that the agents are self-interested, the environment is …

Finite horizon Q-learning: Stability, convergence, simulations and an application on smart grids

V VP, DS Bhatnagar - arXiv preprint arXiv:2110.15093, 2021 - arxiv.org
Q-learning is a popular reinforcement learning algorithm. This algorithm has however been
studied and analysed mainly in the infinite horizon setting. There are several important …

Structured prediction with reinforcement learning

F Maes, L Denoyer, P Gallinari - Machine learning, 2009 - Springer
We formalize the problem of Structured Prediction as a Reinforcement Learning task. We
first define a Structured Prediction Markov Decision Process (SP-MDP), an instantiation of …

Experience replay using transition sequences

TG Karimpanal, R Bouffanais - Frontiers in neurorobotics, 2018 - frontiersin.org
Experience replay is one of the most commonly used approaches to improve the sample
efficiency of reinforcement learning algorithms. In this work, we propose an approach to …