A survey of deep RL and IL for autonomous driving policy learning

Z Zhu, H Zhao - IEEE Transactions on Intelligent Transportation …, 2021 - ieeexplore.ieee.org
Autonomous driving (AD) agents generate driving policies based on online perception
results, which are obtained at multiple levels of abstraction, e.g., behavior planning, motion …

Learning multimodal rewards from rankings

V Myers, E Biyik, N Anari… - Conference on robot …, 2022 - proceedings.mlr.press
Learning from human feedback has been shown to be a useful approach in acquiring robot
reward functions. However, expert feedback is often assumed to be drawn from an …

Deep generative models for offline policy learning: Tutorial, survey, and perspectives on future directions

J Chen, B Ganguly, Y Xu, Y Mei, T Lan… - arXiv preprint arXiv …, 2024 - arxiv.org
Deep generative models (DGMs) have demonstrated great success across various domains,
particularly in generating texts, images, and videos using models trained from offline data …

Ess-InfoGAIL: Semi-supervised imitation learning from imbalanced demonstrations

H Fu, K Tang, Y Lu, Y Qi, G Deng… - Advances in Neural …, 2024 - proceedings.neurips.cc
Imitation learning aims to reproduce expert behaviors without relying on an explicit reward
signal. However, real-world demonstrations often present challenges, such as multi-modal …

Adversarial option-aware hierarchical imitation learning

M Jing, W Huang, F Sun, X Ma… - International …, 2021 - proceedings.mlr.press
It has been a challenge to learn skills for an agent from long-horizon unannotated
demonstrations. Existing approaches like Hierarchical Imitation Learning (HIL) are prone to …

Lane change decision prediction: an efficient BO-XGB modelling approach with SHAP analysis

H Sun, Q Cheng, P Wang, Y Huang… - … A: Transport Science, 2024 - Taylor & Francis
The lane-change decision (LCD) is a critical aspect of driving behaviour. This study
proposes an LCD model based on a Bayesian optimization (BO) framework and extreme …

A dynamic test scenario generation method for autonomous vehicles based on conditional generative adversarial imitation learning

L Jia, D Yang, Y Ren, C Qian, Q Feng, B Sun… - Accident Analysis & …, 2024 - Elsevier
Autonomous vehicles must be comprehensively evaluated before being deployed in cities and
highways. However, most existing evaluation approaches for autonomous vehicles are static …

Data-Driven Policy Learning Methods from Biological Behavior: A Systematic Review

Y Wang, M Hayashibe, D Owaki - Applied Sciences, 2024 - mdpi.com
Policy learning enables agents to learn how to map states to actions, thus enabling adaptive
and flexible behavioral generation in complex environments. Policy learning methods are …

RTA-IR: A runtime assurance framework for behavior planning based on imitation learning and responsibility-sensitive safety model

Y Peng, G Tan, H Si - Expert Systems with Applications, 2023 - Elsevier
Current research on artificial intelligence (AI) algorithms in safety-critical areas remains
extremely challenging due to their inability to be fully verified at design time. In this paper …

Hierarchical Imitation Learning for Stochastic Environments

M Igl, P Shah, P Mougin, S Srinivasan… - 2023 IEEE/RSJ …, 2023 - ieeexplore.ieee.org
Many applications of imitation learning require the agent to generate the full distribution of
behaviour observed in the training data. For example, to evaluate the safety of autonomous …