Hypervolume maximization: A geometric view of Pareto set learning
This paper presents a novel approach to multiobjective algorithms aimed at modeling the
Pareto set using neural networks. Whereas previous methods mainly focused on identifying …
Train once, get a family: State-adaptive balances for offline-to-online reinforcement learning
Offline-to-online reinforcement learning (RL) is a training paradigm that combines pre-
training on a pre-collected dataset with fine-tuning in an online environment. However, the …
Anti-exploration by random network distillation
Despite the success of Random Network Distillation (RND) in various domains, it was shown
as not discriminative enough to be used as an uncertainty estimator for penalizing out-of …
Uni-o4: Unifying online and offline deep reinforcement learning with multi-step on-policy optimization
Combining offline and online reinforcement learning (RL) is crucial for efficient and safe
learning. However, previous approaches treat offline and online learning as separate …
Proto: Iterative policy regularized offline-to-online reinforcement learning
Offline-to-online reinforcement learning (RL), by combining the benefits of offline pretraining
and online finetuning, promises enhanced sample efficiency and policy performance …
Model-based offline reinforcement learning with count-based conservatism
In this paper, we present a model-based offline reinforcement learning method that
integrates count-based conservatism, named $\texttt{Count-MORL}$. Our method utilizes …
Energy Management for a DM-i Plug-in Hybrid Electric Vehicle via Continuous-Discrete Reinforcement Learning
C Gong, J Xu, Y Lin - arXiv preprint arXiv:2306.08823, 2023 - arxiv.org
Energy management strategy (EMS) is a key technology for plug-in hybrid electric vehicles
(PHEVs). The energy management of PHEVs needs to output continuous variables such as …
Exploration and anti-exploration with distributional random network distillation
Exploration remains a critical issue in deep reinforcement learning for an agent to attain high
returns in unknown environments. Although the prevailing exploration Random Network …
A simple unified uncertainty-guided framework for offline-to-online reinforcement learning
Offline reinforcement learning (RL) provides a promising solution to learning an agent fully
relying on a data-driven paradigm. However, constrained by the limited quality of the offline …
Preference-Optimized Pareto Set Learning for Blackbox Optimization
Multi-Objective Optimization (MOO) is an important problem in real-world applications.
However, for a non-trivial problem, no single solution exists that can optimize all the …