Universal off-policy evaluation
When faced with sequential decision-making problems, it is often useful to be able to predict
what would happen if decisions were made using a new policy. Those predictions must …
Subgaussian and differentiable importance sampling for off-policy evaluation and learning
Importance Sampling (IS) is a widely used building block for a large variety of off-policy
estimation and learning algorithms. However, empirical and theoretical studies have …
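As background to the entry above: in its simplest (bandit) form, ordinary importance sampling reweights rewards logged under a behavior policy by the ratio of target to behavior action probabilities. A minimal sketch, with a hypothetical `is_estimate` helper and toy two-armed-bandit data (not from any of the listed papers):

```python
import random

def is_estimate(data, target_probs, behavior_probs):
    """Ordinary importance-sampling estimate of the target policy's
    expected reward, from data logged under the behavior policy.

    data: list of (action, reward) pairs logged by the behavior policy.
    target_probs / behavior_probs: dicts mapping action -> probability.
    """
    total = 0.0
    for action, reward in data:
        # Importance weight corrects for the mismatch in action distributions.
        weight = target_probs[action] / behavior_probs[action]
        total += weight * reward
    return total / len(data)

# Toy example: behavior policy is uniform; target policy prefers arm 1.
behavior = {0: 0.5, 1: 0.5}
target = {0: 0.1, 1: 0.9}
random.seed(0)
# Deterministic rewards for clarity: arm 0 pays 1.0, arm 1 pays 0.5.
logged = [(a, [1.0, 0.5][a]) for a in (random.randrange(2) for _ in range(1000))]
print(is_estimate(logged, target, behavior))  # near the true value 0.55
```

The estimator is unbiased whenever the behavior policy gives nonzero probability to every action the target policy can take; its high variance under large weights is exactly the issue the subgaussian variants above target.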
Offline reinforcement learning with closed-form policy improvement operators
Behavior constrained policy optimization has been demonstrated to be a successful
paradigm for tackling Offline Reinforcement Learning. By exploiting historical transitions, a …
Off-policy evaluation with deficient support using side information
Abstract The Off-Policy Evaluation (OPE) problem consists in evaluating the performance of
new policies from the data collected by another one. OPE is crucial when evaluating a new …
Identification of efficient sampling techniques for probabilistic voltage stability analysis of renewable-rich power systems
This paper presents a comparative analysis of six sampling techniques to identify an efficient
and accurate sampling technique to be applied to probabilistic voltage stability assessment …
Inferring smooth control: Monte carlo posterior policy iteration with gaussian processes
Monte Carlo methods have become increasingly relevant for control of non-differentiable
systems, approximate dynamics models, and learning from data. These methods scale to …
IWDA: Importance weighting for drift adaptation in streaming supervised learning problems
Distribution drift is an important issue for practical applications of machine learning (ML). In
particular, in streaming ML, the data distribution may change over time, yielding the problem …
Research on data-driven optimal scheduling of power system
J Luo, W Zhang, H Wang, W Wei, J He - Energies, 2023 - mdpi.com
The uncertainty of output makes it difficult to effectively solve the economic security
dispatching problem of the power grid when a high proportion of renewable energy …
AutoOPE: Automated Off-Policy Estimator Selection
The Off-Policy Evaluation (OPE) problem consists of evaluating the performance of
counterfactual policies with data collected by another one. This problem is of utmost …
Training recommenders over large item corpus with importance sampling
By predicting a personalized ranking on a set of items, item recommendation helps users
determine the information they need. While optimizing a ranking-focused loss is more in line …