Universal off-policy evaluation

Y Chandak, S Niekum, B da Silva… - Advances in Neural Information Processing Systems, 2021 - proceedings.neurips.cc
When faced with sequential decision-making problems, it is often useful to be able to predict
what would happen if decisions were made using a new policy. Those predictions must …
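For orientation, a minimal sketch of the vanilla importance-sampling (IS) estimator that off-policy evaluation methods such as this one build on and generalize; the policy callables and data layout are illustrative assumptions, not this paper's method:

    import numpy as np

    def is_value_estimate(pi_e, pi_b, contexts, actions, rewards):
        # Vanilla IS: reweight each logged reward by how much more (or less)
        # likely the target policy pi_e is to take the logged action than
        # the behavior policy pi_b that collected the data.
        w = np.array([pi_e(x, a) / pi_b(x, a)
                      for x, a in zip(contexts, actions)])
        return float(np.mean(w * np.asarray(rewards)))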

Subgaussian and differentiable importance sampling for off-policy evaluation and learning

AM Metelli, A Russo, M Restelli - Advances in Neural Information Processing Systems, 2021 - proceedings.neurips.cc
Importance Sampling (IS) is a widely used building block for a large variety of off-policy
estimation and learning algorithms. However, empirical and theoretical studies have …
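The issue the snippet alludes to is heavy-tailed IS weights. A generic truncate-then-self-normalize variant is sketched below; the threshold c is an arbitrary assumption, and this is a standard trick rather than the paper's own estimator:

    import numpy as np

    def truncated_snips(weights, rewards, c=10.0):
        # Clip the IS weights at c, then self-normalize: both steps trade
        # a little bias for much lighter tails and tighter concentration.
        w = np.minimum(np.asarray(weights, dtype=float), c)
        return float(np.sum(w * np.asarray(rewards)) / np.sum(w))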

Offline reinforcement learning with closed-form policy improvement operators

J Li, E Zhang, M Yin, Q Bai, YX Wang… - International Conference on Machine Learning, 2023 - proceedings.mlr.press
Behavior-constrained policy optimization has been demonstrated to be a successful
paradigm for tackling offline reinforcement learning. By exploiting historical transitions, a …
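For context, the KL-regularized improvement step behind behavior-constrained methods does admit a textbook closed form: reweight the behavior policy by exponentiated advantages. A sketch of that generic form (the paper derives its own operators; beta and the tabular layout are assumptions):

    import numpy as np

    def kl_constrained_improvement(pi_b_probs, advantages, beta=1.0):
        # argmax_pi E_pi[A] - beta * KL(pi || pi_b) over action
        # distributions has the closed form
        # pi(a|s) proportional to pi_b(a|s) * exp(A(s, a) / beta).
        unnorm = pi_b_probs * np.exp(advantages / beta)
        return unnorm / unnorm.sum(axis=-1, keepdims=True)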

Off-policy evaluation with deficient support using side information

N Felicioni, M Ferrari Dacrema… - Advances in Neural Information Processing Systems, 2022 - proceedings.neurips.cc
The Off-Policy Evaluation (OPE) problem consists of evaluating the performance of
new policies from the data collected by another one. OPE is crucial when evaluating a new …
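"Deficient support" means the behavior policy never plays some actions the target policy would. A quick diagnostic for how much of the target policy's mass plain IS can actually see; the names and the discrete action space are assumptions:

    import numpy as np

    def supported_mass(pi_e, pi_b, contexts, action_space):
        # Average, over contexts, of the probability mass pi_e puts on
        # actions with pi_b support; anything below 1.0 is mass that
        # plain IS silently drops, which is exactly where side
        # information has to fill in.
        fracs = [sum(pi_e(x, a) for a in action_space if pi_b(x, a) > 0)
                 for x in contexts]
        return float(np.mean(fracs))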

Identification of efficient sampling techniques for probabilistic voltage stability analysis of renewable-rich power systems

M Alzubaidi, KN Hasan, L Meegahapola, MT Rahman - Energies, 2021 - mdpi.com
This paper presents a comparative analysis of six sampling techniques to identify an efficient
and accurate sampling technique to be applied to probabilistic voltage stability assessment …
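One of the standard candidates in such comparisons is Latin hypercube sampling, available in SciPy. A sketch with placeholder dimensions and bounds (e.g. wind speed and load level), not the paper's actual study setup:

    import numpy as np
    from scipy.stats import qmc

    # 128 stratified draws of two uncertain inputs; LHS covers each
    # marginal evenly, which is why it typically needs far fewer runs
    # than crude Monte Carlo for the same accuracy in probabilistic
    # stability studies.
    sampler = qmc.LatinHypercube(d=2, seed=0)
    samples = qmc.scale(sampler.random(n=128),
                        l_bounds=[3.0, 0.6], u_bounds=[25.0, 1.2])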

Inferring smooth control: Monte Carlo posterior policy iteration with Gaussian processes

J Watson, J Peters - Conference on Robot Learning, 2023 - proceedings.mlr.press
Monte Carlo methods have become increasingly relevant for control of non-differentiable
systems, approximate dynamics models, and learning from data. These methods scale to …
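The common core of such Monte Carlo control schemes is an exponentiated-reward reweighting of sampled action sequences. A minimal sketch of that step; the temperature, shapes, and toy objective are assumptions, and the paper's Gaussian-process machinery is omitted:

    import numpy as np

    def posterior_weights(returns, temperature=1.0):
        # Softmax over sampled returns; the weighted mean of the samples
        # gives the updated policy mean, as in posterior policy iteration.
        z = (np.asarray(returns) - np.max(returns)) / temperature
        w = np.exp(z)
        return w / w.sum()

    rng = np.random.default_rng(0)
    candidates = rng.normal(size=(64, 10))    # 64 sampled control sequences
    returns = -np.sum(candidates**2, axis=1)  # toy quadratic objective
    new_mean = posterior_weights(returns) @ candidates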

IWDA: Importance weighting for drift adaptation in streaming supervised learning problems

F Fedeli, AM Metelli, F Trovò… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Distribution drift is an important issue for practical applications of machine learning (ML). In
particular, in streaming ML, the data distribution may change over time, yielding the problem …
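Importance weighting for drift means reweighting old training samples by the density ratio between the new and old input distributions. A standard classifier-based ratio estimate is sketched below; it is a generic stand-in (scikit-learn based) rather than IWDA itself:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def density_ratio_weights(X_old, X_new):
        # Train a probabilistic classifier to separate old from new data;
        # p / (1 - p) on old points estimates p_new(x) / p_old(x) up to
        # the class-prior constant, which weighted retraining absorbs.
        X = np.vstack([X_old, X_new])
        y = np.r_[np.zeros(len(X_old)), np.ones(len(X_new))]
        clf = LogisticRegression(max_iter=1000).fit(X, y)
        p = clf.predict_proba(X_old)[:, 1]
        return p / (1.0 - p)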

Research on data-driven optimal scheduling of power system

J Luo, W Zhang, H Wang, W Wei, J He - Energies, 2023 - mdpi.com
The uncertainty of generation output makes it difficult to effectively solve the economic
and secure dispatch problem of the power grid when a high proportion of renewable energy …

AutoOPE: Automated Off-Policy Estimator Selection

N Felicioni, M Benigni, MF Dacrema - arXiv preprint arXiv:2406.18022, 2024 - arxiv.org
The Off-Policy Evaluation (OPE) problem consists of evaluating the performance of
counterfactual policies with data collected by another one. This problem is of utmost …

Training recommenders over large item corpus with importance sampling

D Lian, Z Gao, X Song, Y Li, Q Liu… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
By predicting a personalized ranking on a set of items, item recommendation helps users
determine the information they need. While optimizing a ranking-focused loss is more in line …
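Over a large item corpus the full softmax is too expensive, so such methods sample negatives from a proposal q and correct the logits by subtracting log q. A toy sketch; the corpus size, embeddings, and q are all placeholder assumptions:

    import numpy as np

    rng = np.random.default_rng(0)
    n_items, k, d = 100_000, 256, 16
    q = rng.random(n_items); q /= q.sum()            # sampling proposal
    neg = rng.choice(n_items, size=k, p=q, replace=False)

    user = rng.normal(size=d)
    items = rng.normal(size=(n_items, d))
    # Subtracting log q makes the sampled softmax match the full softmax
    # in expectation (the standard importance-sampling correction).
    corrected_logits = items[neg] @ user - np.log(q[neg])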