- Academic Search

Principled reinforcement learning with human feedback from pairwise or k-wise comparisons

B Zhu, M Jordan, J Jiao - International Conference on …, 2023 - proceedings.mlr.press

We provide a theoretical framework for Reinforcement Learning with Human Feedback
(RLHF). We show that when the underlying true reward is linear, under both Bradley-Terry …

Cited by 177

Crowdsourced data management: A survey

G Li, J Wang, Y Zheng… - IEEE Transactions on …, 2016 - ieeexplore.ieee.org

Many important data management and analytics tasks cannot be completely addressed by
automated processes. These tasks, such as entity resolution, sentiment analysis, and image …

Cited by 405

[BOOK][B] Communication Complexity: and Applications

A Rao, A Yehudayoff - 2020 - books.google.com

Communication complexity is the mathematical study of scenarios where several parties
need to communicate to achieve a common goal, a situation that naturally appears during …

Cited by 184

Good quantum error-correcting codes exist

AR Calderbank, PW Shor - Physical Review A, 1996 - APS

A quantum error-correcting code is defined to be a unitary mapping (encoding) of k qubits
(two-state quantum systems) into a subspace of the quantum state space of n qubits such …

Cited by 3253

Revocation and tracing schemes for stateless receivers

D Naor, M Naor, J Lotspiech - … in Cryptology—CRYPTO 2001: 21st Annual …, 2001 - Springer

We deal with the problem of a center sending a message to a group of users such that some
subset of the users is considered revoked and should not be able to obtain the content of the …

Cited by 1653

Batched multi-armed bandits problem

Z Gao, Y Han, Z Ren, Z Zhou - Advances in Neural …, 2019 - proceedings.neurips.cc

In this paper, we study the multi-armed bandit problem in the batched setting where the
employed policy must split data into a small number of batches. While the minimax regret for …

Cited by 172

The k-armed dueling bandits problem

Y Yue, J Broder, R Kleinberg, T Joachims - Journal of Computer and …, 2012 - Elsevier

We study a partial-information online-learning problem where actions are restricted to noisy
comparisons between pairs of strategies (also known as bandits). In contrast to conventional …

Cited by 412

Efficient ranking from pairwise comparisons

F Wauthier, M Jordan, N Jojic - International Conference on …, 2013 - proceedings.mlr.press

The ranking of n objects based on pairwise comparisons is a core machine learning
problem, arising in recommender systems, ad placement, player ranking, biological …

Cited by 255

Large-scale validation and analysis of interleaved search evaluation

O Chapelle, T Joachims, F Radlinski… - ACM Transactions on …, 2012 - dl.acm.org

Interleaving is an increasingly popular technique for evaluating information retrieval systems
based on implicit user feedback. While a number of isolated studies have analyzed how this …

Cited by 231

How to compress interactive communication

B Barak, M Braverman, X Chen, A Rao - Proceedings of the forty-second …, 2010 - dl.acm.org

We describe new ways to simulate 2-party communication protocols to get protocols with
potentially smaller communication. We show that every communication protocol that …

Cited by 276

Computing with noisy information

Principled reinforcement learning with human feedback from pairwise or k-wise comparisons
