Recent advances in reinforcement learning in finance

B Hambly, R Xu, H Yang - Mathematical Finance, 2023 - Wiley Online Library
The rapid changes in the finance industry due to the increasing amount of data have
revolutionized the techniques on data processing and data analysis and brought new …

A survey on model-based reinforcement learning

FM Luo, T Xu, H Lai, XH Chen, W Zhang… - Science China Information …, 2024 - Springer
Reinforcement learning (RL) interacts with the environment to solve sequential decision-
making problems via a trial-and-error approach. Errors are always undesirable in real-world …

[BOOK][B] Algorithms for decision making

MJ Kochenderfer, TA Wheeler, KH Wray - 2022 - books.google.com
A broad introduction to algorithms for decision making under uncertainty, introducing the
underlying mathematical problem formulations and the algorithms for solving them …

Morel: Model-based offline reinforcement learning

R Kidambi, A Rajeswaran… - Advances in neural …, 2020 - proceedings.neurips.cc
In offline reinforcement learning (RL), the goal is to learn a highly rewarding policy based
solely on a dataset of historical interactions with the environment. This serves as an extreme …

Reinforcement learning in healthcare: A survey

C Yu, J Liu, S Nemati, G Yin - ACM Computing Surveys (CSUR), 2021 - dl.acm.org
As a subfield of machine learning, reinforcement learning (RL) aims at optimizing decision
making by using interaction samples of an agent with its environment and the potentially …

An introduction to deep reinforcement learning

V François-Lavet, P Henderson, R Islam… - … and Trends® in …, 2018 - nowpublishers.com
Deep reinforcement learning is the combination of reinforcement learning (RL) and deep
learning. This field of research has been able to solve a wide range of complex …

Learning to explore using active neural slam

DS Chaplot, D Gandhi, S Gupta, A Gupta… - arXiv preprint arXiv …, 2020 - arxiv.org
This work presents a modular and hierarchical approach to learn policies for exploring 3D
environments, called 'Active Neural SLAM'. Our approach leverages the strengths of both …

On the theory of policy gradient methods: Optimality, approximation, and distribution shift

A Agarwal, SM Kakade, JD Lee, G Mahajan - Journal of Machine Learning …, 2021 - jmlr.org
Policy gradient methods are among the most effective methods in challenging reinforcement
learning problems with large state and/or action spaces. However, little is known about even …

Policy finetuning: Bridging sample-efficient offline and online reinforcement learning

T Xie, N Jiang, H Wang, C Xiong… - Advances in neural …, 2021 - proceedings.neurips.cc
Recent theoretical work studies sample-efficient reinforcement learning (RL) extensively in
two settings: learning interactively in the environment (online RL), or learning from an offline …

[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com
Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …