Google Scholar

Risk-sensitive reinforcement learning applied to control under constraints

P Geibel, F Wysotzki - Journal of Artificial Intelligence Research, 2005 - jair.org

In this paper, we consider Markov Decision Processes (MDPs) with error states. Error states
are those states entering which is undesirable or dangerous. We define the risk with respect …‏

Cited by 446 - All 18 versions

Hyperbolic discounting and learning over multiple horizons‏

W Fedus, C Gelada, Y Bengio, MG Bellemare… - arXiv preprint arXiv …, 2019 - arxiv.org

Reinforcement learning (RL) typically defines a discount factor as part of the Markov
Decision Process. The discount factor values future rewards by an exponential scheme that …‏

Cited by 129 - All 3 versions

Accelerated primal-dual policy optimization for safe reinforcement learning‏

Q Liang, F Que, E Modiano - arXiv preprint arXiv:1802.06480, 2018 - arxiv.org

Constrained Markov Decision Process (CMDP) is a natural framework for reinforcement
learning tasks with safety constraints, where agents learn a policy that maximizes the long …‏

Cited by 127 - All 2 versions

Constrained discounted Markov decision processes and Hamiltonian cycles‏

EA Feinberg - Mathematics of Operations Research, 2000‏ - pubsonline.informs.org‏

This paper establishes new links between stochastic and discrete optimization. We consider
the following three problems for discrete time Markov Decision Processes with finite states …‏

Cited by 118 - All 8 versions

Stationary deterministic policies for constrained MDPs with multiple rewards, costs, and discount factors

DA Dolgov, EH Durfee - IJCAI, 2005‏ - Citeseer‏

We consider the problem of policy optimization for a resource-limited agent with multiple
time-dependent objectives, represented as an MDP with multiple discount factors in the …

Cited by 91 - All 10 versions

Markov decision processes

L Kallenberg - Lecture Notes. University of Leiden, 2011‏ - researchgate.net‏

Branching out from operations research roots of the 1950's, Markov decision processes
(MDPs) have gained recognition in such diverse fields as economics, telecommunication …‏

Cited by 81 - All 5 versions

Constrained reinforcement learning from intrinsic and extrinsic rewards‏

E Uchibe, K Doya - 2007 IEEE 6th International Conference on …, 2007‏ - ieeexplore.ieee.org‏

The main objective of a standard reinforcement learner is usually defined as maximization of
a scalar reward function given externally from the environment. On the other hand, an …‏

Cited by 74 - All 10 versions

Constrained average cost Markov control processes in Borel spaces‏

O Hernández-Lerma, J González-Hernández… - SIAM Journal on Control …, 2003‏ - SIAM‏

This paper considers constrained Markov control processes in Borel spaces, with
unbounded costs. The criterion to be minimized is a long-run expected average cost, and …‏

Cited by 87 - All 6 versions

Controlled random sequences: methods of convex analysis and problems with functional constraints‏

AB Piunovskii - Russian Mathematical Surveys, 1998‏ - iopscience.iop.org‏

Abstract Contents Introduction § 1. Controlled random sequences: main definitions and
traditional approaches § 1.1. Description of the mathematical model § 1.2. Models with …‏

Cited by 27 - All 9 versions

Constrained Markov control processes in Borel spaces: the discounted case‏

O Hernández-Lerma… - Mathematical Methods of …, 2000‏ - Springer‏

We consider constrained discounted-cost Markov control processes in Borel spaces, with
unbounded costs. Conditions are given for the constrained problem to be solvable, and also …‏

Cited by 75 - All 12 versions
