Spectral entry-wise matrix estimation for low-rank reinforcement learning

S Stojanovic, Y Jedra… - Advances in Neural …, 2023 - proceedings.neurips.cc
We study matrix estimation problems arising in reinforcement learning with low-rank
structure. In low-rank bandits, the matrix to be recovered specifies the expected arm …

Optimal algorithms for latent bandits with cluster structure

S Pal, AS Suggala, K Shanmugam… - … Conference on Artificial …, 2023 - proceedings.mlr.press
We consider the problem of latent bandits with cluster structure where there are multiple
users, each with an associated multi-armed bandit problem. These users are grouped into …

Online matrix completion: A collaborative approach with hott items

D Baby, S Pal - arXiv preprint arXiv:2408.05843, 2024 - arxiv.org
We investigate the low rank matrix completion problem in an online setting with $M$
users, $N$ items, $T$ rounds, and an unknown rank-$r$ reward matrix $R\in\mathbb …

Blocked collaborative bandits: online collaborative filtering with per-item budget constraints

S Pal, A Suggala, K Shanmugam… - Advances in Neural …, 2024 - proceedings.neurips.cc
We consider the problem of blocked collaborative bandits where there are multiple
users, each with an associated multi-armed bandit problem. These users are grouped …

Multi-user reinforcement learning with low rank rewards

DM Nagaraj, SS Kowshik, N Agarwal… - International …, 2023 - proceedings.mlr.press
We consider collaborative multi-user reinforcement learning, where multiple users have the
same state-action space and transition probabilities but different rewards. Under the …

A scalable recommendation engine for new users and items

B Xu, Y Deng, C Mela - arXiv preprint arXiv:2209.06128, 2022 - arxiv.org
In many digital contexts such as online news and e-tailing with many new users and items,
recommendation systems face several challenges: i) how to make initial recommendations …

Multi-User Reinforcement Learning with Low Rank Rewards

N Agarwal, P Jain, S Kowshik, D Nagaraj… - arXiv preprint arXiv …, 2022 - arxiv.org
In this work, we consider the problem of collaborative multi-user reinforcement learning. In
this setting there are multiple users with the same state-action space and transition …

Improving Mobile Maternal and Child Health Care Programs: Collaborative Bandits for Time Slot Selection.

S Pal, M Tambe, AS Suggala, K Shanmugam… - …, 2024 - teamcore.seas.harvard.edu
Maternal mortality is unacceptably high in several parts of the world. In 2020, an estimated
287,000 women died from preventable causes related to pregnancy and childbirth [30] …

Match Made with Matrix Completion: Efficient Offline and Online Learning in Matching Markets

Z Tang, W Chen, K Xu - Available at SSRN 4976903, 2024 - papers.ssrn.com
Online matching markets face increasing needs to accurately learn the matching qualities
between demand and supply for effective design of matching policies. However, the growing …

Online Algorithms and Beyond Worst-Case Learning

AK Ruwanpathirana - 2024 - search.proquest.com
This dissertation investigates online algorithms and beyond worst-case learning, focusing
on an array of problems that showcase the applicability of online algorithms to derive robust …