Provable benefits of policy learning from human preferences in contextual bandit problems

X Ji, H Wang, M Chen, T Zhao, M Wang - arXiv preprint arXiv:2307.12975, 2023 - arxiv.org
For a real-world decision-making problem, the reward function often needs to be engineered
or learned. A popular approach is to utilize human feedback to learn a reward function for …

Crowdsourced top-k queries by pairwise preference judgments with confidence and budget control

Y Li, H Wang, NM Kou, LH U, Z Gong - The VLDB Journal, 2021 - Springer
Crowdsourced query processing is an emerging technique that tackles computationally
challenging problems by human intelligence. The basic idea is to decompose a …

Efficient crowdsourced best objects finding via superiority probability based ordering for decision support systems

B Yin, W Zeng, X Wei - Expert Systems with Applications, 2023 - Elsevier
Best objects finding is a fundamental operation in decision support systems and
applications. When numerical values of objects cannot be obtained from existing computer …

Learning from ranking data: theory and methods

A Korba - 2018 - pastel.hal.science
Ranking data, i.e., ordered list of items, naturally appears in a wide variety of situations,
especially when the data comes from human activities (ballots in political elections, survey …

Generalized Low-Rank Optimization for Ultra-dense Fog-RANs

Y Shi, K Yang, Y Yang - Ultra-Dense Networks: Principles and …, 2020 - cambridge.org
Expectations for new wireless networks have become higher since mobile data has grown
exponentially and more diverse user services have emerged. Intensive deployment of …