Recent advances in reinforcement learning in finance

B Hambly, R Xu, H Yang - Mathematical Finance, 2023 - Wiley Online Library

The rapid changes in the finance industry due to the increasing amount of data have
revolutionized the techniques on data processing and data analysis and brought new …

Cited by 206

A survey on model-based reinforcement learning

FM Luo, T Xu, H Lai, XH Chen, W Zhang… - Science China Information …, 2024 - Springer

Reinforcement learning (RL) interacts with the environment to solve sequential decision-
making problems via a trial-and-error approach. Errors are always undesirable in real-world …

Cited by 110

Is pessimism provably efficient for offline RL?

Y Jin, Z Yang, Z Wang - International Conference on …, 2021 - proceedings.mlr.press

We study offline reinforcement learning (RL), which aims to learn an optimal policy based on
a dataset collected a priori. Due to the lack of further interactions with the environment …

Cited by 449

Provably efficient reinforcement learning with linear function approximation

C Jin, Z Yang, Z Wang… - Conference on learning …, 2020 - proceedings.mlr.press

Modern Reinforcement Learning (RL) is commonly applied to practical problems
with an enormous number of states, where function approximation must be deployed …

Cited by 771

Bridging offline reinforcement learning and imitation learning: A tale of pessimism

P Rashidinejad, B Zhu, C Ma, J Jiao… - Advances in Neural …, 2021 - proceedings.neurips.cc

Offline (or batch) reinforcement learning (RL) algorithms seek to learn an optimal policy from
a fixed dataset without active data collection. Based on the composition of the offline dataset …

Cited by 315

Model-based reinforcement learning: A survey

TM Moerland, J Broekens, A Plaat… - … and Trends® in …, 2023 - nowpublishers.com

Sequential decision making, commonly formalized as Markov Decision Process (MDP)
optimization, is an important challenge in artificial intelligence. Two key approaches to this …

Cited by 920

Bellman eluder dimension: New rich classes of RL problems, and sample-efficient algorithms

C Jin, Q Liu, S Miryoosefi - Advances in neural information …, 2021 - proceedings.neurips.cc

Finding the minimal structural assumptions that empower sample-efficient learning is one of
the most important research directions in Reinforcement Learning (RL). This paper …

Cited by 263

Nearly minimax optimal reinforcement learning for linear mixture Markov decision processes

D Zhou, Q Gu, C Szepesvari - Conference on Learning …, 2021 - proceedings.mlr.press

We study reinforcement learning (RL) with linear function approximation where the
underlying transition probability kernel of the Markov decision process (MDP) is a linear …

Cited by 244

Pessimistic Q-learning for offline reinforcement learning: Towards optimal sample complexity

L Shi, G Li, Y Wei, Y Chen… - … conference on machine …, 2022 - proceedings.mlr.press

Offline or batch reinforcement learning seeks to learn a near-optimal policy using history
data without active exploration of the environment. To counter the insufficient coverage and …

Cited by 106

Model-based reinforcement learning with value-targeted regression

A Ayoub, Z Jia, C Szepesvari… - … on Machine Learning, 2020 - proceedings.mlr.press

This paper studies model-based reinforcement learning (RL) for regret minimization. We
focus on finite-horizon episodic RL where the transition model $P$ belongs to a known …

Cited by 351
