Pure exploration with multiple correct answers
We determine the sample complexity of pure exploration bandit problems with multiple good
answers. We derive a lower bound using a new game equilibrium argument. We show how …
Partially observable total-cost Markov decision processes with weakly continuous transition probabilities
This paper describes sufficient conditions for the existence of optimal policies for partially
observable Markov decision processes (POMDPs) with Borel state, observation, and action …
On the feasibility and continuity of feedback controllers defined by multiple control barrier functions
Control barrier functions are a popular method for encoding safety specifications for
dynamical systems. In this paper, a notion of control barrier function is defined that permits …
Optimality conditions for inventory control
EA Feinberg - … Challenges in Complex, Networked and Risky …, 2016 - pubsonline.informs.org
This tutorial describes recently developed general optimality conditions for Markov decision
processes that have significant applications to inventory control. In particular, these …
Learning optimal antenna tilt control policies: A contextual linear bandits approach
Controlling antenna tilts in cellular networks is critical to achieve a good trade-off between
network coverage and capacity. We devise algorithms learning optimal tilt control policies …
Continuity of discounted values and the structure of optimal policies for periodic‐review inventory systems with setup costs
EA Feinberg, DN Kraemer - Naval Research Logistics (NRL), 2023 - Wiley Online Library
This paper proves continuity of value functions in discounted periodic‐review single‐
commodity total‐cost inventory control problems with continuous inventory levels, fixed …
Convergence of probability measures and Markov decision models with incomplete information
This paper deals with three major types of convergence of probability measures on metric
spaces: weak convergence, setwise convergence, and convergence in total variation. First, it …
Saturated total-population dependent branching process and viral markets
Interesting posts are continually forwarded by the users of the online social network (OSN).
Such propagation leads to re-forwarding of the post to some of the previous recipients …
Structure of optimal policies to periodic-review inventory models with convex costs and backorders for all values of discount factors
This paper describes the structure of optimal policies for discounted periodic-review single-
commodity total-cost inventory control problems with fixed ordering costs for finite and …
On the convergence of optimal actions for Markov decision processes and the optimality of (s, S) inventory policies
This article studies convergence properties of optimal values and actions for discounted and
average‐cost Markov decision processes (MDPs) with weakly continuous transition …