- Academic Search

Quality-similar diversity via population based reinforcement learning

S Wu, J Yao, H Fu, Y Tian, C Qian, Y Yang… - The Eleventh …, 2023 - openreview.net

Diversity is a growing research topic in Reinforcement Learning (RL). Previous research on
diversity has mainly focused on promoting diversity to encourage exploration and thereby …

[PDF] iospress.nl

Reinforcement learning by guided safe exploration

Q Yang, TD Simão, N Jansen, SH Tindemans… - ECAI 2023, 2023 - ebooks.iospress.nl

Safety is critical to broadening the application of reinforcement learning (RL). Often, we train
RL agents in a controlled environment, such as a laboratory, before deploying them in the …

[PDF] ru.nl

[PDF] Training and transferring safe policies in reinforcement learning

Q Yang, T Simão, N Jansen, S Tindemans, M Spaan - 2022 - repository.ubn.ru.nl

Safety is critical to broadening the application of reinforcement learning (RL). Often, RL
agents are trained in a controlled environment, such as a laboratory, before being deployed …

[PDF] arxiv.org

Human-Modeling in Sequential Decision-Making: An Analysis through the Lens of Human-Aware AI

S Tulli, SL Vasileiou, S Sreedharan - arXiv preprint arXiv:2405.07773, 2024 - arxiv.org

"Human-aware" has become a popular keyword used to describe a particular class of AI
systems that are designed to work and interact with humans. While there exists a surprising …

[PDF] academia.edu

Inverse Reinforcement Learning with Learning and Leveraging Demonstrators' Varying Expertise Levels

S Oguchienti, M Ghasemi - 2023 59th Annual Allerton …, 2023 - ieeexplore.ieee.org

A common assumption in most Inverse Reinforcement Learning (IRL) methods is that human
demonstrations are drawn from an optimal policy. However, this assumption poses a …

[HTML] tudelft.nl

[HTML] Safe Online and Offline Reinforcement Learning

TD Simão - 2023 - research.tudelft.nl

Reinforcement Learning (RL) agents can solve general problems based on little to no
knowledge of the underlying environment. These agents learn through experience, using a …

[PDF] nudt.edu.cn

A population diversity-based robust policy generation method in adversarial game environments

S ZHUANG, Y CHEN, Y HAO, W WU… - Computer …, 2024 - joces.nudt.edu.cn

In adversarial game environments, the objective agent aims to generate robust game
policies, ensuring high returns when facing different opponent policies consistently. Existing …

[PDF] dtic.mil

Control, Learning and Adaptation in Information-Constrained, Adversarial Environments

YE Bayiz, S Carr, ES Crafts, M Cubuktepe, F Djeumou… - 2023 - apps.dtic.mil

We propose to develop a theoretical and algorithmic foundation that will help create
autonomous robotic agents capable of executing patrol missions in urban environments …

[PDF] nudt.edu.cn

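A population diversity-based robust policy generation method in adversarial environments

庄述鑫, 陈永红, 郝一行, 吴巍炜, 徐学永… - Computer Engineering and …, 2024 - joces.nudt.edu.cn

In adversarial game environments, the target agent aims to generate highly robust game
policies so that it consistently achieves high returns when facing different opponent policies.
Existing self-play-based policy generation methods often overfit to …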
