Google Scholar

Recent advances in reinforcement learning in finance

B Hambly, R Xu, H Yang - Mathematical Finance, 2023 - Wiley Online Library

The rapid changes in the finance industry due to the increasing amount of data have
revolutionized the techniques on data processing and data analysis and brought new …

Save Cite Cited by 220 Related articles All 14 versions

[PDF] arxiv.org

Multi-agent reinforcement learning: A selective overview of theories and algorithms

K Zhang, Z Yang, T Başar - Handbook of reinforcement learning and …, 2021 - Springer

Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …

Save Cite Cited by 1728 Related articles All 7 versions

[PDF] neurips.cc

Online robust reinforcement learning with model uncertainty

Y Wang, S Zou - Advances in Neural Information Processing …, 2021 - proceedings.neurips.cc

Robust reinforcement learning (RL) is to find a policy that optimizes the worst-case
performance over an uncertainty set of MDPs. In this paper, we focus on model-free robust …

Save Cite Cited by 115 Related articles All 10 versions View as HTML

[PDF] neurips.cc

A finite-time analysis of two time-scale actor-critic methods

YF Wu, W Zhang, P Xu, Q Gu - Advances in Neural …, 2020 - proceedings.neurips.cc

Actor-critic (AC) methods have exhibited great empirical success compared with other
reinforcement learning algorithms, where the actor uses the policy gradient to improve the …

Save Cite Cited by 171 Related articles All 7 versions View as HTML

[PDF] mlr.press

Provably efficient reinforcement learning for discounted MDPs with feature mapping

D Zhou, J He, Q Gu - International Conference on Machine …, 2021 - proceedings.mlr.press

Modern tasks in reinforcement learning have large state and action spaces. To deal with
them efficiently, one often uses predefined feature mapping to represent states and actions …

Save Cite Cited by 150 Related articles All 6 versions View as HTML

[PDF] neurips.cc

On the Convergence and Sample Complexity Analysis of Deep Q-Networks with $\varepsilon$-Greedy Exploration

S Zhang, H Li, M Wang, M Liu… - Advances in …, 2023 - proceedings.neurips.cc

This paper provides a theoretical understanding of deep Q-Network (DQN) with the
$\varepsilon$-greedy exploration in deep reinforcement learning. Despite the tremendous …

Save Cite Cited by 24 Related articles All 8 versions View as HTML

[PDF] neurips.cc

Neural temporal-difference learning converges to global optima

Q Cai, Z Yang, JD Lee, Z Wang - Advances in Neural …, 2019 - proceedings.neurips.cc

Temporal-difference learning (TD), coupled with neural networks, is among the
most fundamental building blocks of deep reinforcement learning. However, due to the …

Save Cite Cited by 157 Related articles All 13 versions View as HTML

[PDF] arxiv.org

Actor-critic reinforcement learning for control with stability guarantee

M Han, L Zhang, J Wang, W Pan - IEEE Robotics and …, 2020 - ieeexplore.ieee.org

Reinforcement Learning (RL) and its integration with deep learning have achieved
impressive performance in various robotic control tasks, ranging from motion planning and …

Save Cite Cited by 122 Related articles All 11 versions

[PDF] neurips.cc

Improving sample complexity bounds for (natural) actor-critic algorithms

T Xu, Z Wang, Y Liang - Advances in Neural Information …, 2020 - proceedings.neurips.cc

The actor-critic (AC) algorithm is a popular method to find an optimal policy in reinforcement
learning. In the infinite horizon scenario, the finite-sample convergence rate for the AC and …

Save Cite Cited by 121 Related articles All 8 versions View as HTML

[PDF] neurips.cc

Policy optimization provably converges to Nash equilibria in zero-sum linear quadratic games

K Zhang, Z Yang, T Başar - Advances in Neural Information …, 2019 - proceedings.neurips.cc

We study the global convergence of policy optimization for finding the Nash equilibria (NE)
in zero-sum linear quadratic (LQ) games. To this end, we first investigate the landscape of …

Save Cite Cited by 150 Related articles All 10 versions View as HTML

Finite-sample analysis for sarsa with linear function approximation

Recent advances in reinforcement learning in finance
