Recent advances in reinforcement learning in finance

B Hambly, R Xu, H Yang - Mathematical Finance, 2023 - Wiley Online Library
The rapid changes in the finance industry due to the increasing amount of data have
revolutionized the techniques on data processing and data analysis and brought new …

Provably efficient reinforcement learning with linear function approximation

C **, Z Yang, Z Wang… - Conference on learning …, 2020 - proceedings.mlr.press
Abstract Modern Reinforcement Learning (RL) is commonly applied to practical problems
with an enormous number of states, where\emph {function approximation} must be deployed …

Model-based reinforcement learning with value-targeted regression

A Ayoub, Z Jia, C Szepesvari… - … on Machine Learning, 2020 - proceedings.mlr.press
This paper studies model-based reinforcement learning (RL) for regret minimization. We
focus on finite-horizon episodic RL where the transition model $ P $ belongs to a known …

Flambe: Structural complexity and representation learning of low rank mdps

A Agarwal, S Kakade… - Advances in neural …, 2020 - proceedings.neurips.cc
In order to deal with the curse of dimensionality in reinforcement learning (RL), it is common
practice to make parametric assumptions where values or policies are functions of some low …

Representation learning for online and offline rl in low-rank mdps

M Uehara, X Zhang, W Sun - arxiv preprint arxiv:2110.04652, 2021 - arxiv.org
This work studies the question of Representation Learning in RL: how can we learn a
compact low-dimensional representation such that on top of the representation we can …

Pessimistic model-based offline reinforcement learning under partial coverage

M Uehara, W Sun - arxiv preprint arxiv:2107.06226, 2021 - arxiv.org
We study model-based offline Reinforcement Learning with general function approximation
without a full coverage assumption on the offline data distribution. We present an algorithm …

Nearly minimax optimal reinforcement learning for linear mixture markov decision processes

D Zhou, Q Gu, C Szepesvari - Conference on Learning …, 2021 - proceedings.mlr.press
We study reinforcement learning (RL) with linear function approximation where the
underlying transition probability kernel of the Markov decision process (MDP) is a linear …

Provably efficient exploration in policy optimization

Q Cai, Z Yang, C **, Z Wang - International Conference on …, 2020 - proceedings.mlr.press
While policy-based reinforcement learning (RL) achieves tremendous successes in practice,
it is significantly less understood in theory, especially compared with value-based RL. In …

Provable benefits of actor-critic methods for offline reinforcement learning

A Zanette, MJ Wainwright… - Advances in neural …, 2021 - proceedings.neurips.cc
Actor-critic methods are widely used in offline reinforcement learningpractice, but are not so
well-understood theoretically. We propose a newoffline actor-critic algorithm that naturally …

Neural contextual bandits with ucb-based exploration

D Zhou, L Li, Q Gu - International Conference on Machine …, 2020 - proceedings.mlr.press
We study the stochastic contextual bandit problem, where the reward is generated from an
unknown function with additive noise. No assumption is made about the reward function …