Recent advances in reinforcement learning in finance

B Hambly, R Xu, H Yang - Mathematical Finance, 2023 - Wiley Online Library
The rapid changes in the finance industry due to the increasing amount of data have
revolutionized the techniques on data processing and data analysis and brought new …

Motif: Intrinsic motivation from artificial intelligence feedback

M Klissarov, P D'Oro, S Sodhani, R Raileanu… - arXiv preprint arXiv …, 2023 - arxiv.org
Exploring rich environments and evaluating one's actions without prior knowledge is
immensely challenging. In this paper, we propose Motif, a general method to interface such …

Guarantees for epsilon-greedy reinforcement learning with function approximation

C Dann, Y Mansour, M Mohri… - International …, 2022 - proceedings.mlr.press
Myopic exploration policies such as epsilon-greedy, softmax, or Gaussian noise fail to
explore efficiently in some reinforcement learning tasks and yet, they perform well in many …

On the importance of exploration for generalization in reinforcement learning

Y Jiang, JZ Kolter, R Raileanu - Advances in Neural …, 2023 - proceedings.neurips.cc
Existing approaches for improving generalization in deep reinforcement learning (RL) have
mostly focused on representation learning, neglecting RL-specific aspects such as …

Temporal abstraction in reinforcement learning with the successor representation

MC Machado, A Barreto, D Precup… - Journal of machine …, 2023 - jmlr.org
Reasoning at multiple levels of temporal abstraction is one of the key attributes of
intelligence. In reinforcement learning, this is often modeled through temporally extended …

Reinforcement learning: An overview

K Murphy - arXiv preprint arXiv:2412.05265, 2024 - arxiv.org
This manuscript gives a big-picture, up-to-date overview of the field of (deep) reinforcement
learning and sequential decision making, covering value-based RL, policy-gradient …

UAV path planning optimization strategy: Considerations of urban morphology, microclimate, and energy efficiency using Q-learning algorithm

A Souto, R Alfaia, E Cardoso, J Araújo, C Francês - Drones, 2023 - mdpi.com
The use of unmanned aerial vehicles (UAVs) has been suggested as a potential
communications alternative due to their fast implantation, which makes this resource an …

Deep Laplacian-based options for temporally-extended exploration

M Klissarov, MC Machado - arXiv preprint arXiv:2301.11181, 2023 - arxiv.org
Selecting exploratory actions that generate a rich stream of experience for better learning is
a fundamental challenge in reinforcement learning (RL). An approach to tackle this problem …

TempoRL: Learning when to act

A Biedenkapp, R Rajan, F Hutter… - … on Machine Learning, 2021 - proceedings.mlr.press
Reinforcement learning is a powerful approach to learn behaviour through interactions with
an environment. However, behaviours are usually learned in a purely reactive fashion …

Automated reinforcement learning (AutoRL): A survey and open problems

J Parker-Holder, R Rajan, X Song, A Biedenkapp… - Journal of Artificial …, 2022 - jair.org
The combination of Reinforcement Learning (RL) with deep learning has led to a
series of impressive feats, with many believing (deep) RL provides a path towards generally …