Modeling recommender ecosystems: Research challenges at the intersection of mechanism design, reinforcement learning and generative models
Modern recommender systems lie at the heart of complex ecosystems that couple the
behavior of users, content providers, advertisers, and other actors. Despite this, the focus of …
PARL: A unified framework for policy alignment in reinforcement learning from human feedback
We present a novel unified bilevel optimization-based framework, PARL, formulated
to address the recently highlighted critical issue of policy alignment in reinforcement …
STRIDE: A tool-assisted LLM agent framework for strategic and interactive decision-making
Large Language Models (LLMs) like GPT-4 have revolutionized natural language
processing, showing remarkable linguistic proficiency and reasoning capabilities. However …
Automated design of affine maximizer mechanisms in dynamic settings
Dynamic mechanism design is a challenging extension to ordinary mechanism design in
which the mechanism designer must make a sequence of decisions over time in the face of …
Pessimism meets VCG: Learning dynamic mechanism design via offline reinforcement learning
B Lyu, Z Wang, M Kolar, Z Yang - … Conference on Machine …, 2022 - proceedings.mlr.press
Dynamic mechanism design has garnered significant attention from both computer scientists
and economists in recent years. By allowing agents to interact with the seller over multiple …
Optimal Mechanism Design for Sequential Decision Making Processes
B Lyu - 2024 - search.proquest.com
For the dissertation, we propose studying efficiently learning the optimal dynamic
mechanism when the agents' valuations can be characterized by a Markov Decision …
Principal-Driven Reward Design and Agent Policy Alignment via Bilevel-RL
In reinforcement learning (RL), a reward function is often assumed at the outset of a policy
optimization procedure. Learning in such a fixed reward paradigm in RL can neglect …