Scalable agent alignment via reward modeling: a research direction
J Leike, D Krueger, T Everitt, M Martic, V Maini… - arXiv preprint arXiv:1811.07871, 2018 - arxiv.org
Mapping out the Space of Human Feedback for Reinforcement Learning: A Conceptual Framework
Reinforcement Learning from Human Feedback (RLHF) has become a powerful tool to fine-tune or train agentic machine learning models. Similar to how humans interact in social …
Instructed reinforcement learning control of safe autonomous J-turn vehicle maneuvers
This paper presents a safe control policy search for autonomous J-turn maneuvers inspired by professional car drivers. These drivers have been highly trained to achieve highly …
Safe motion control and planning for autonomous racing vehicles
A Arab - 2021 - search.proquest.com
In the future of the autonomous car industry, saving lives might depend on more demanding maneuvers than what average drivers know how to do. Professional race car drivers, as …
Taming the Sample Complexity in Agentifying AI Systems by the Exploitation of Explicit Human Knowledge
L Guan - 2024 - search.proquest.com
Extensive efforts have been dedicated to the development of AI agents that can independently carry out sequential decision-making tasks. Learning-based solutions …
Guidance Priors to Reduce Human Feedback Burden in Sequential Decision Making
M Verma - 2024 - search.proquest.com
Human-in-the-loop sequential decision making, such as Reinforcement Learning from Human Feedback (RLHF) or behavior synthesis, leverages human feedback to the AI system …