Google Scholar

Reinforcement learning for crop management support: Review, prospects and challenges

R Gautron, OA Maillard, P Preux, M Corbeels… - … and Electronics in …, 2022 - Elsevier

Reinforcement learning (RL), including multi-armed bandits, is a branch of machine learning
that deals with the problem of sequential decision-making in uncertain and unknown …

Cited by 62; all 15 versions

[PDF] academia.edu

[BOOK][B] Simulation-based optimization

A Gosavi - 2015 - Springer

This book is written for students and researchers in the field of industrial engineering,
computer science, operations research, management science, electrical engineering, and …

Cited by 846; all 11 versions

[PDF] arxiv.org

A reinforcement learning extension to the Almgren-Chriss framework for optimal trade execution

D Hendricks, D Wilcox - 2014 IEEE Conference on …, 2014 - ieeexplore.ieee.org

Reinforcement learning is explored as a candidate machine learning technique to enhance
existing analytical solutions for optimal trade execution with elements from the market …

Cited by 112; all 7 versions

Efficient energy management in smart grids with finite horizon Q-learning

VP Vivek, S Bhatnagar - Sustainable Energy, Grids and Networks, 2024 - Elsevier

Efficient energy distribution in smart grids is an important problem driven by the need to
manage increasing power consumption across the globe. This problem has been studied in …

Cited by 4; all 2 versions

[HTML] sciencedirect.com

[HTML] A simple learning agent interacting with an agent-based market model

M Dicks, A Paskaramoorthy, T Gebbie - Physica A: Statistical Mechanics …, 2024 - Elsevier

We consider the learning dynamics of a single reinforcement learning optimal execution
trading agent when it interacts with an event-driven agent-based financial market model …

Cited by 8; all 7 versions

[PDF] arxiv.org

A policy gradient approach for finite horizon constrained Markov decision processes

S Guin, S Bhatnagar - 2023 62nd IEEE Conference on Decision …, 2023 - ieeexplore.ieee.org

The infinite horizon setting is widely adopted for problems of reinforcement learning (RL).
These invariably result in stationary policies that are optimal. In many situations, finite …

Cited by 8; all 4 versions

[PDF] soton.ac.uk

An adaptive bilateral negotiation model for e-commerce settings

V Narayanan, NR Jennings - Seventh IEEE International …, 2005 - ieeexplore.ieee.org

This paper studies adaptive bilateral negotiation between software agents in e-commerce
environments. Specifically, we assume that the agents are self-interested, the environment is …

Cited by 73; all 9 versions

[PDF] arxiv.org

Finite horizon Q-learning: Stability, convergence, simulations and an application on smart grids

V VP, DS Bhatnagar - arXiv preprint arXiv:2110.15093, 2021 - arxiv.org

Q-learning is a popular reinforcement learning algorithm. This algorithm has however been
studied and analysed mainly in the infinite horizon setting. There are several important …

Cited by 10; all 2 versions

[PDF] springer.com

Structured prediction with reinforcement learning

F Maes, L Denoyer, P Gallinari - Machine learning, 2009 - Springer

We formalize the problem of Structured Prediction as a Reinforcement Learning task. We
first define a Structured Prediction Markov Decision Process (SP-MDP), an instantiation of …

Cited by 45; all 16 versions

[PDF] frontiersin.org

Experience replay using transition sequences

TG Karimpanal, R Bouffanais - Frontiers in neurorobotics, 2018 - frontiersin.org

Experience replay is one of the most commonly used approaches to improve the sample
efficiency of reinforcement learning algorithms. In this work, we propose an approach to …

Cited by 21; all 14 versions

A learning rate analysis of reinforcement learning algorithms in finite-horizon