Sequential information design: Learning to persuade in the dark
We study a repeated information design problem faced by an informed sender who tries to
influence the behavior of a self-interested receiver. We consider settings where the receiver …
Safe Linear Bandits over Unknown Polytopes
A Gangrade, T Chen… - The Thirty Seventh Annual …, 2024 - proceedings.mlr.press
The safe linear bandit problem (SLB) is an online approach to linear programming with
unknown objective and unknown roundwise constraints, under stochastic bandit …
Achieving Regular and Fair Learning in Combinatorial Multi-Armed Bandit
Combinatorial multi-armed bandit refers to the model that aims to maximize cumulative
rewards in the presence of uncertainty. Motivated by two important wireless network …
Learning Adversarial MDPs with Stochastic Hard Constraints
We study online learning problems in constrained Markov decision processes (CMDPs) with
adversarial losses and stochastic hard constraints. We consider two different scenarios. In …
Machine learning to optimize additive manufacturing for visible photonics
Additive manufacturing has become an important tool for fabricating advanced systems and
devices for visible nanophotonics. However, the lack of simulation and optimization methods …
Online learning in sequential Bayesian persuasion: Handling unknown priors
We study a repeated information design problem faced by an informed sender who tries to
influence the behavior of a self-interested receiver, through the provision of payoff-relevant …
Doubly-Optimistic Play for Safe Linear Bandits
The safe linear bandit problem (SLB) is an online approach to linear programming with
unknown objective and unknown round-wise constraints, under stochastic bandit feedback …
A General Framework for Safe Decision Making: A Convex Duality Approach
We study the problem of online interaction in general decision making problems, where the
objective is not only to find optimal strategies, but also to satisfy some safety guarantees …
Constrained learning in the bandit setting: doubly optimistic strategies and fast rates
T Chen - 2024 - open.bu.edu
The (stochastic) bandit problem is a classic example used to address the challenge of
balancing exploration and exploitation when dealing with bandit feedback. This dissertation …
Online learning in CMDPs with adversarial losses and stochastic hard constraints
We study online learning in constrained Markov decision processes (CMDPs) with
adversarial losses and stochastic hard constraints, under bandit feedback. We consider two …