Modeling recommender ecosystems: Research challenges at the intersection of mechanism design, reinforcement learning and generative models

C Boutilier, M Mladenov, G Tennenholtz - arXiv preprint arXiv:2309.06375, 2023 - arxiv.org
Modern recommender systems lie at the heart of complex ecosystems that couple the
behavior of users, content providers, advertisers, and other actors. Despite this, the focus of …

PARL: A unified framework for policy alignment in reinforcement learning from human feedback

S Chakraborty, AS Bedi, A Koppel, D Manocha… - arXiv preprint arXiv …, 2023 - arxiv.org
We present a novel unified bilevel optimization-based framework, PARL, formulated
to address the recently highlighted critical issue of policy alignment in reinforcement …

STRIDE: A tool-assisted LLM agent framework for strategic and interactive decision-making

C Li, R Yang, T Li, M Bafarassat, K Sharifi… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) like GPT-4 have revolutionized natural language
processing, showing remarkable linguistic proficiency and reasoning capabilities. However …

Automated design of affine maximizer mechanisms in dynamic settings

M Curry, V Thoma, D Chakrabarti, S McAleer… - Proceedings of the …, 2024 - ojs.aaai.org
Dynamic mechanism design is a challenging extension to ordinary mechanism design in
which the mechanism designer must make a sequence of decisions over time in the face of …

Pessimism meets VCG: Learning dynamic mechanism design via offline reinforcement learning

B Lyu, Z Wang, M Kolar, Z Yang - … Conference on Machine …, 2022 - proceedings.mlr.press
Dynamic mechanism design has garnered significant attention from both computer scientists
and economists in recent years. By allowing agents to interact with the seller over multiple …

Optimal Mechanism Design for Sequential Decision Making Processes

B Lyu - 2024 - search.proquest.com
For the dissertation, we propose studying efficiently learning the optimal dynamic
mechanism when the agents' valuations can be characterized by a Markov Decision …

Principal-Driven Reward Design and Agent Policy Alignment via Bilevel-RL

In reinforcement learning (RL), a reward function is often assumed at the outset of a policy
optimization procedure. Learning in such a fixed reward paradigm in RL can neglect …