Hypervolume maximization: A geometric view of Pareto set learning

X Zhang, X Lin, B Xue, Y Chen… - Advances in Neural …, 2023 - proceedings.neurips.cc
This paper presents a novel approach to multiobjective algorithms aimed at modeling the
Pareto set using neural networks. Whereas previous methods mainly focused on identifying …

Train once, get a family: State-adaptive balances for offline-to-online reinforcement learning

S Wang, Q Yang, J Gao, M Lin… - Advances in …, 2024 - proceedings.neurips.cc
Offline-to-online reinforcement learning (RL) is a training paradigm that combines
pre-training on a pre-collected dataset with fine-tuning in an online environment. However, the …

Anti-exploration by random network distillation

A Nikulin, V Kurenkov, D Tarasov… - … on Machine Learning, 2023 - proceedings.mlr.press
Despite the success of Random Network Distillation (RND) in various domains, it was shown
as not discriminative enough to be used as an uncertainty estimator for penalizing out-of …

Uni-O4: Unifying online and offline deep reinforcement learning with multi-step on-policy optimization

K Lei, Z He, C Lu, K Hu, Y Gao, H Xu - arXiv preprint arXiv:2311.03351, 2023 - arxiv.org
Combining offline and online reinforcement learning (RL) is crucial for efficient and safe
learning. However, previous approaches treat offline and online learning as separate …

PROTO: Iterative policy regularized offline-to-online reinforcement learning

J Li, X Hu, H Xu, J Liu, X Zhan, YQ Zhang - arXiv preprint arXiv …, 2023 - arxiv.org
Offline-to-online reinforcement learning (RL), by combining the benefits of offline pretraining
and online finetuning, promises enhanced sample efficiency and policy performance …

Model-based offline reinforcement learning with count-based conservatism

B Kim, M Oh - International Conference on Machine …, 2023 - proceedings.mlr.press
In this paper, we present a model-based offline reinforcement learning method that
integrates count-based conservatism, named Count-MORL. Our method utilizes …

Energy Management for a DM-i Plug-in Hybrid Electric Vehicle via Continuous-Discrete Reinforcement Learning

C Gong, J Xu, Y Lin - arXiv preprint arXiv:2306.08823, 2023 - arxiv.org
Energy management strategy (EMS) is a key technology for plug-in hybrid electric vehicles
(PHEVs). The energy management of PHEVs needs to output continuous variables such as …

Exploration and anti-exploration with distributional random network distillation

K Yang, J Tao, J Lyu, X Li - arXiv preprint arXiv:2401.09750, 2024 - arxiv.org
Exploration remains a critical issue in deep reinforcement learning for an agent to attain high
returns in unknown environments. Although the prevailing exploration Random Network …

A simple unified uncertainty-guided framework for offline-to-online reinforcement learning

S Guo, Y Sun, J Hu, S Huang, H Chen, H Piao… - arXiv preprint arXiv …, 2023 - arxiv.org
Offline reinforcement learning (RL) provides a promising solution to learning an agent fully
relying on a data-driven paradigm. However, constrained by the limited quality of the offline …

Preference-Optimized Pareto Set Learning for Blackbox Optimization

Z Haishan, D Das, K Tsuda - arXiv preprint arXiv:2408.09976, 2024 - arxiv.org
Multi-Objective Optimization (MOO) is an important problem in real-world applications.
However, for a non-trivial problem, no single solution exists that can optimize all the …