Recent advances in reinforcement learning in finance
The rapid changes in the finance industry due to the increasing amount of data have
revolutionized the techniques on data processing and data analysis and brought new …
Multi-agent reinforcement learning: A selective overview of theories and algorithms
Recent years have witnessed significant advances in reinforcement learning (RL), which
has registered tremendous success in solving various sequential decision-making problems …
Online robust reinforcement learning with model uncertainty
Robust reinforcement learning (RL) is to find a policy that optimizes the worst-case
performance over an uncertainty set of MDPs. In this paper, we focus on model-free robust …
A finite-time analysis of two time-scale actor-critic methods
Actor-critic (AC) methods have exhibited great empirical success compared with other
reinforcement learning algorithms, where the actor uses the policy gradient to improve the …
Provably efficient reinforcement learning for discounted MDPs with feature mapping
Modern tasks in reinforcement learning have large state and action spaces. To deal with
them efficiently, one often uses predefined feature mapping to represent states and actions …
On the Convergence and Sample Complexity Analysis of Deep Q-Networks with $\varepsilon$-Greedy Exploration
This paper provides a theoretical understanding of deep Q-Network (DQN) with the
$\varepsilon$-greedy exploration in deep reinforcement learning. Despite the tremendous …
Neural temporal-difference learning converges to global optima
Temporal-difference learning (TD), coupled with neural networks, is among the
most fundamental building blocks of deep reinforcement learning. However, due to the …
Actor-critic reinforcement learning for control with stability guarantee
Reinforcement Learning (RL) and its integration with deep learning have achieved
impressive performance in various robotic control tasks, ranging from motion planning and …
Improving sample complexity bounds for (natural) actor-critic algorithms
The actor-critic (AC) algorithm is a popular method to find an optimal policy in reinforcement
learning. In the infinite horizon scenario, the finite-sample convergence rate for the AC and …
Policy optimization provably converges to Nash equilibria in zero-sum linear quadratic games
We study the global convergence of policy optimization for finding the Nash equilibria (NE)
in zero-sum linear quadratic (LQ) games. To this end, we first investigate the landscape of …