Reinforcement learning with guarantees: a review

P Osinenko, D Dobriborsci, W Aumer - IFAC-PapersOnLine, 2022 - Elsevier
Reinforcement learning is concerned with a generic concept of an agent acting in an
environment. From the control theory standpoint, reinforcement learning may be considered …

[PDF] Dynamic potential-based reward shaping

SM Devlin, D Kudenko - 11th International Conference on …, 2012 - pure.york.ac.uk
Potential-based reward shaping can significantly improve the time needed to learn an
optimal policy and, in multiagent systems, the performance of the final joint-policy. It has …

Reward shaping in episodic reinforcement learning

M Grzes - 2017 - kar.kent.ac.uk
Recent advancements in reinforcement learning confirm that reinforcement learning
techniques can solve large scale problems leading to high quality autonomous decision …

[BOOK] Multi-agent machine learning: A reinforcement approach

HM Schwartz - 2014 - books.google.com
The book begins with a chapter on traditional methods of supervised learning, covering
recursive least squares learning, mean square error methods, and stochastic approximation …

Graph convolutional recurrent networks for reward shaping in reinforcement learning

H Sami, J Bentahar, A Mourad, H Otrok, E Damiani - Information Sciences, 2022 - Elsevier
In this paper, we consider the problem of low-speed convergence in Reinforcement
Learning (RL). As a solution, various potential-based reward shaping techniques were …

Temporal-logic-based reward shaping for continuing reinforcement learning tasks

Y Jiang, S Bharadwaj, B Wu, R Shah, U Topcu… - Proceedings of the …, 2021 - ojs.aaai.org
In continuing tasks, average-reward reinforcement learning may be a more appropriate
problem formulation than the more common discounted reward formulation. As usual …

Optimizing anti-collision strategy for MASS: A safe reinforcement learning approach to improve maritime traffic safety

C Wang, X Zhang, H Gao, M Bashir, H Li… - Ocean & Coastal …, 2024 - Elsevier
Maritime autonomous surface ships (MASS) promise enhanced efficiency, reduced human
errors, and to improve maritime traffic safety. However, MASS navigation in complex …

[HTML] Exploring three pillars of construction robotics via dual-track quantitative analysis

Y Liu, AHB Alias, NA Haron, NA Bakar… - Automation in …, 2024 - Elsevier
Construction robotics has emerged as a leading technology in the construction industry. This
paper conducts an innovative dual-track quantitative comprehensive method to analyze the …

COLERGs-constrained safe reinforcement learning for realising MASS's risk-informed collision avoidance decision making

C Wang, X Zhang, H Gao, M Bashir, H Li… - Knowledge-Based …, 2024 - Elsevier
Maritime autonomous surface ship (MASS) represents a significant advancement in
maritime technology, offering the potential for increased efficiency, reduced operational …

Reward shaping in multiagent reinforcement learning for self-organizing systems in assembly tasks

B Huang, Y Jin - Advanced Engineering Informatics, 2022 - Elsevier
Self-organizing systems feature flexibility and robustness for tasks that may endure changes
over time. Various methods, e.g., applying task-field and social-field, have been proposed to …