A state‐of‐the‐art review of optimal reservoir control for managing conflicting demands in a changing world
The state of the art for optimal water reservoir operations is rapidly evolving, driven by
emerging societal challenges. Changing values for balancing environmental resources …
Explainable AI over the Internet of Things (IoT): Overview, state-of-the-art and future directions
SK Jagatheesaperumal, QV Pham… - IEEE Open Journal …, 2022 - ieeexplore.ieee.org
Explainable Artificial Intelligence (XAI) is transforming the field of Artificial Intelligence (AI) by
enhancing the trust of end-users in machines. As the number of connected devices keeps on …
Rewarded soups: towards Pareto-optimal alignment by interpolating weights fine-tuned on diverse rewards
Foundation models are first pre-trained on vast unsupervised datasets and then fine-tuned
on labeled data. Reinforcement learning, notably from human feedback (RLHF), can further …
Multi-objective GFlowNets
M Jain, SC Raparthy… - International …, 2023 - proceedings.mlr.press
We study the problem of generating diverse candidates in the context of Multi-Objective
Optimization. In many applications of machine learning such as drug discovery and material …
Personalized soups: Personalized large language model alignment via post-hoc parameter merging
While Reinforcement Learning from Human Feedback (RLHF) aligns Large Language
Models (LLMs) with general, aggregate human preferences, it is suboptimal for learning …
Scalar reward is not enough: A response to Silver, Singh, Precup and Sutton (2021)
The recent paper “Reward is Enough” by Silver, Singh, Precup and Sutton posits that the
concept of reward maximisation is sufficient to underpin all intelligence, both natural and …
Goals, usefulness and abstraction in value-based choice
B De Martino, A Cortese - Trends in Cognitive Sciences, 2023 - cell.com
Abstract Colombian drug lord Pablo Escobar, while on the run, purportedly burned two
million dollars in banknotes to keep his daughter warm. A stark reminder that, in life …
A review of deep reinforcement learning approaches for smart manufacturing in industry 4.0 and 5.0 framework
A del Real Torres, DS Andreiana, Á Ojeda Roldán… - Applied Sciences, 2022 - mdpi.com
In this review, the industry's current issues regarding intelligent manufacture are presented.
This work presents the status and the potential for the I4.0 and I5.0's revolutionary …
Optimistic linear support and successor features as a basis for optimal policy transfer
In many real-world applications, reinforcement learning (RL) agents might have to solve
multiple tasks, each one typically modeled via a reward function. If reward functions are …
Promptable behaviors: Personalizing multi-objective rewards from human preferences
Customizing robotic behaviors to be aligned with diverse human preferences is an
underexplored challenge in the field of embodied AI. In this paper we present Promptable …