Chatbot Arena: An open platform for evaluating LLMs by human preference

WL Chiang, L Zheng, Y Sheng… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have unlocked new capabilities and applications; however,
evaluating the alignment with human preferences still poses significant challenges. To …

Do we need hundreds of classifiers to solve real world classification problems?

M Fernández-Delgado, E Cernadas, S Barro… - The Journal of Machine …, 2014 - jmlr.org
We evaluate 179 classifiers arising from 17 families (discriminant analysis, Bayesian, neural
networks, support vector machines, decision trees, rule-based classifiers, boosting, bagging …

A review on instance ranking problems in statistical learning

T Werner - Machine Learning, 2022 - Springer
Ranking problems, also known as preference learning problems, define a widely spread
class of statistical learning problems with many applications, including fraud detection …

Spectral MLE: Top-K rank aggregation from pairwise comparisons

Y Chen, C Suh - International Conference on Machine …, 2015 - proceedings.mlr.press
This paper explores the preference-based top-K rank aggregation problem. Suppose that a
collection of items is repeatedly compared in pairs, and one wishes to recover a consistent …

Preference-based online learning with dueling bandits: A survey

V Bengs, R Busa-Fekete, A El Mesaoudi-Paul… - Journal of Machine …, 2021 - jmlr.org
In machine learning, the notion of multi-armed bandits refers to a class of online learning
problems, in which an agent is supposed to simultaneously explore and exploit a given set …

Learning multimodal rewards from rankings

V Myers, E Biyik, N Anari… - Conference on Robot …, 2022 - proceedings.mlr.press
Learning from human feedback has shown to be a useful approach in acquiring robot
reward functions. However, expert feedback is often assumed to be drawn from an …

Effective sampling and learning for Mallows models with pairwise-preference data

T Lu, C Boutilier - J. Mach. Learn. Res., 2014 - jmlr.org
Learning preference distributions is a critical problem in many areas (e.g., recommender
systems, IR, social choice). However, many existing learning and inference methods impose …

Online rank elicitation for Plackett-Luce: A dueling bandits approach

B Szörényi, R Busa-Fekete, A Paul… - Advances in neural …, 2015 - proceedings.neurips.cc
We study the problem of online rank elicitation, assuming that rankings of a set of
alternatives obey the Plackett-Luce distribution. Following the setting of the dueling bandits …

Subset selection based on multiple rankings in the presence of bias: Effectiveness of fairness constraints for multiwinner voting score functions

N Boehmer, LE Celis, L Huang… - International …, 2023 - proceedings.mlr.press
We consider the problem of subset selection where one is given multiple rankings of items
and the goal is to select the highest "quality" subset. Score functions from the multiwinner …

Properties of the Mallows model depending on the number of alternatives: A warning for an experimentalist

N Boehmer, P Faliszewski… - … Conference on Machine …, 2023 - proceedings.mlr.press
The Mallows model is a popular distribution for ranked data. We empirically and theoretically
analyze how the properties of rankings sampled from the Mallows model change when …