Google Scholar

Recent advances in reinforcement learning in finance

B Hambly, R Xu, H Yang - Mathematical Finance, 2023 - Wiley Online Library

The rapid changes in the finance industry due to the increasing amount of data have
revolutionized the techniques on data processing and data analysis and brought new …

Cited by 211. All 13 versions.

[PDF] mlr.press

Nearly minimax optimal reinforcement learning for linear mixture Markov decision processes

D Zhou, Q Gu, C Szepesvari - Conference on Learning …, 2021 - proceedings.mlr.press

We study reinforcement learning (RL) with linear function approximation where the
underlying transition probability kernel of the Markov decision process (MDP) is a linear …

Cited by 245. All 7 versions.

[PDF] tor-lattimore.com

[BOOK][B] Bandit algorithms

T Lattimore, C Szepesvári - 2020 - books.google.com

Decision-making in the face of uncertainty is a significant challenge in machine learning,
and the multi-armed bandit model is a commonly used framework to address it. This …

Cited by 3285. All 9 versions.

[PDF] mlr.press

Provably efficient exploration in policy optimization

Q Cai, Z Yang, C Jin, Z Wang - International Conference on …, 2020 - proceedings.mlr.press

While policy-based reinforcement learning (RL) achieves tremendous successes in practice,
it is significantly less understood in theory, especially compared with value-based RL. In …

Cited by 324. All 9 versions.

[PDF] neurips.cc

FLAMBE: Structural complexity and representation learning of low rank MDPs

A Agarwal, S Kakade… - Advances in neural …, 2020 - proceedings.neurips.cc

In order to deal with the curse of dimensionality in reinforcement learning (RL), it is common
practice to make parametric assumptions where values or policies are functions of some low …

Cited by 296. All 10 versions.

[PDF] mlr.press

Learning near optimal policies with low inherent Bellman error

A Zanette, A Lazaric, M Kochenderfer… - International …, 2020 - proceedings.mlr.press

We study the exploration problem with approximate linear action-value functions in episodic
reinforcement learning under the notion of low inherent Bellman error, a condition normally …

Cited by 257. All 5 versions.

[PDF] arxiv.org

Is a good representation sufficient for sample efficient reinforcement learning?

SS Du, SM Kakade, R Wang, LF Yang - arXiv

D Zhou, J He, Q Gu - International Conference on Machine …, 2021 - proceedings.mlr.press

Modern tasks in reinforcement learning have large state and action spaces. To deal with
them efficiently, one often uses predefined feature mapping to represent states and actions …

Cited by 148. All 5 versions.

Saved to your library

Learning with good feature representations in bandits and in RL with a generative model

Recent advances in reinforcement learning in finance

Nearly minimax optimal reinforcement learning for linear mixture Markov decision processes

[BOOK][B] Bandit algorithms

Provably efficient exploration in policy optimization

FLAMBE: Structural complexity and representation learning of low rank MDPs

Learning near optimal policies with low inherent Bellman error

Is a good representation sufficient for sample efficient reinforcement learning?