Google Scholar

A survey of multi-objective sequential decision-making

DM Roijers, P Vamplew, S Whiteson… - Journal of Artificial …, 2013 - jair.org

Sequential decision-making problems with multiple objectives arise naturally in practice and
pose unique challenges for research in decision-theoretic planning and learning, which has …

Cited by 818 (21 versions)

Multiobjective reinforcement learning: A comprehensive overview

C Liu, X Xu, D Hu - IEEE Transactions on Systems, Man, and …, 2014 - ieeexplore.ieee.org

Reinforcement learning (RL) is a powerful paradigm for sequential decision-making under
uncertainties, and most RL algorithms aim to maximize some numerical value which …

Cited by 451 (2 versions)

Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards

A Rame, G Couairon, C Dancette… - Advances in …, 2023 - proceedings.neurips.cc

Foundation models are first pre-trained on vast unsupervised datasets and then fine-tuned
on labeled data. Reinforcement learning, notably from human feedback (RLHF), can further …

Cited by 103 (7 versions)

A practical guide to multi-objective reinforcement learning and planning

CF Hayes, R Rădulescu, E Bargiacchi… - Autonomous Agents and …, 2022 - Springer

Real-world sequential decision-making tasks are generally complex, requiring trade-offs
between multiple, often conflicting, objectives. Despite this, the majority of research in …

Cited by 403 (21 versions)

Scalar reward is not enough: A response to Silver, Singh, Precup and Sutton (2021)

P Vamplew, BJ Smith, J Källström, G Ramos… - Autonomous Agents and …, 2022 - Springer

The recent paper “Reward is Enough” by Silver, Singh, Precup and Sutton posits that the
concept of reward maximisation is sufficient to underpin all intelligence, both natural and …

Cited by 92 (17 versions)

Multi-objective reinforcement learning using sets of Pareto dominating policies

K Van Moffaert, A Nowé - The Journal of Machine Learning Research, 2014 - jmlr.org

Many real-world problems involve the optimization of multiple, possibly conflicting
objectives. Multi-objective reinforcement learning (MORL) is a generalization of standard …

Cited by 418 (12 versions)

Empirical evaluation methods for multiobjective reinforcement learning algorithms

P Vamplew, R Dazeley, A Berry, R Issabekov… - Machine learning, 2011 - Springer

While a number of algorithms for multiobjective reinforcement learning have been proposed,
and a small number of applications developed, there has been very little rigorous empirical …

Cited by 391 (14 versions)

Human-aligned artificial intelligence is a multiobjective problem

P Vamplew, R Dazeley, C Foale, S Firmin… - Ethics and information …, 2018 - Springer

As the capabilities of artificial intelligence (AI) systems improve, it becomes important to
constrain their actions to ensure their behaviour remains beneficial to humanity. A variety of …

Cited by 171 (10 versions)

A multi-objective deep reinforcement learning framework

TT Nguyen, ND Nguyen, P Vamplew… - … Applications of Artificial …, 2020 - Elsevier

This paper introduces a new scalable multi-objective deep reinforcement learning (MODRL)
framework based on deep Q-networks. We develop a high-performance MODRL framework …

Cited by 153 (9 versions)

A novel cache architecture with enhanced performance and security

Z Wang, RB Lee - … 41st IEEE/ACM International Symposium on …, 2008 - ieeexplore.ieee.org

Caches ideally should have low miss rates and short access times, and should be power
efficient at the same time. Such design goals are often contradictory in practice. Recent …

Cited by 363 (12 versions)
