Risk-sensitive reinforcement learning applied to control under constraints
P Geibel, F Wysotzki - Journal of Artificial Intelligence Research, 2005 - jair.org
In this paper, we consider Markov Decision Processes (MDPs) with error states. Error states
are those states entering which is undesirable or dangerous. We define the risk with respect …
Hyperbolic discounting and learning over multiple horizons
Reinforcement learning (RL) typically defines a discount factor as part of the Markov
Decision Process. The discount factor values future rewards by an exponential scheme that …
Accelerated primal-dual policy optimization for safe reinforcement learning
Constrained Markov Decision Process (CMDP) is a natural framework for reinforcement
learning tasks with safety constraints, where agents learn a policy that maximizes the long …
Constrained discounted Markov decision processes and Hamiltonian cycles
This paper establishes new links between stochastic and discrete optimization. We consider
the following three problems for discrete time Markov Decision Processes with finite states …
Stationary deterministic policies for constrained MDPs with multiple rewards, costs, and discount factors
We consider the problem of policy optimization for a resource-limited agent with multiple
time-dependent objectives, represented as an MDP with multiple discount factors in the …
Markov decision processes
L Kallenberg - Lecture Notes. University of Leiden, 2011 - researchgate.net
Branching out from operations research roots of the 1950's, Markov decision processes
(MDPs) have gained recognition in such diverse fields as economics, telecommunication …
Constrained reinforcement learning from intrinsic and extrinsic rewards
The main objective of a standard reinforcement learner is usually defined as maximization of
a scalar reward function given externally from the environment. On the other hand, an …
Constrained average cost Markov control processes in Borel spaces
This paper considers constrained Markov control processes in Borel spaces, with
unbounded costs. The criterion to be minimized is a long-run expected average cost, and …
Controlled random sequences: methods of convex analysis and problems with functional constraints
Abstract Contents Introduction § 1. Controlled random sequences: main definitions and
traditional approaches § 1.1. Description of the mathematical model § 1.2. Models with …
Constrained Markov control processes in Borel spaces: the discounted case
We consider constrained discounted-cost Markov control processes in Borel spaces, with
unbounded costs. Conditions are given for the constrained problem to be solvable, and also …