A survey of multi-objective sequential decision-making

DM Roijers, P Vamplew, S Whiteson… - Journal of Artificial …, 2013 - jair.org
Sequential decision-making problems with multiple objectives arise naturally in practice and
pose unique challenges for research in decision-theoretic planning and learning, which has …

Multiobjective reinforcement learning: A comprehensive overview

C Liu, X Xu, D Hu - IEEE Transactions on Systems, Man, and …, 2014 - ieeexplore.ieee.org
Reinforcement learning (RL) is a powerful paradigm for sequential decision-making under
uncertainties, and most RL algorithms aim to maximize some numerical value which …

Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

A Rame, G Couairon, C Dancette… - Advances in …, 2023 - proceedings.neurips.cc
Foundation models are first pre-trained on vast unsupervised datasets and then fine-tuned
on labeled data. Reinforcement learning, notably from human feedback (RLHF), can further …

A practical guide to multi-objective reinforcement learning and planning

CF Hayes, R Rădulescu, E Bargiacchi… - Autonomous Agents and …, 2022 - Springer
Real-world sequential decision-making tasks are generally complex, requiring trade-offs
between multiple, often conflicting, objectives. Despite this, the majority of research in …

Scalar reward is not enough: A response to Silver, Singh, Precup and Sutton (2021)

P Vamplew, BJ Smith, J Källström, G Ramos… - Autonomous Agents and …, 2022 - Springer
The recent paper “Reward is Enough” by Silver, Singh, Precup and Sutton posits that the
concept of reward maximisation is sufficient to underpin all intelligence, both natural and …

Multi-objective reinforcement learning using sets of Pareto dominating policies

K Van Moffaert, A Nowé - The Journal of Machine Learning Research, 2014 - jmlr.org
Many real-world problems involve the optimization of multiple, possibly conflicting
objectives. Multi-objective reinforcement learning (MORL) is a generalization of standard …

Empirical evaluation methods for multiobjective reinforcement learning algorithms

P Vamplew, R Dazeley, A Berry, R Issabekov… - Machine Learning, 2011 - Springer
While a number of algorithms for multiobjective reinforcement learning have been proposed,
and a small number of applications developed, there has been very little rigorous empirical …

Human-aligned artificial intelligence is a multiobjective problem

P Vamplew, R Dazeley, C Foale, S Firmin… - Ethics and information …, 2018 - Springer
As the capabilities of artificial intelligence (AI) systems improve, it becomes important to
constrain their actions to ensure their behaviour remains beneficial to humanity. A variety of …

A multi-objective deep reinforcement learning framework

TT Nguyen, ND Nguyen, P Vamplew… - … Applications of Artificial …, 2020 - Elsevier
This paper introduces a new scalable multi-objective deep reinforcement learning (MODRL)
framework based on deep Q-networks. We develop a high-performance MODRL framework …

A novel cache architecture with enhanced performance and security

Z Wang, RB Lee - … 41st IEEE/ACM International Symposium on …, 2008 - ieeexplore.ieee.org
Caches ideally should have low miss rates and short access times, and should be power
efficient at the same time. Such design goals are often contradictory in practice. Recent …