Open Graph Benchmark: Datasets for machine learning on graphs

W Hu, M Fey, M Zitnik, Y Dong, H Ren… - Advances in neural …, 2020 - proceedings.neurips.cc
We present the Open Graph Benchmark (OGB), a diverse set of challenging and
realistic benchmark datasets to facilitate scalable, robust, and reproducible graph machine …

TabPFN: A transformer that solves small tabular classification problems in a second

N Hollmann, S Müller, K Eggensperger… - arXiv preprint arXiv …, 2022 - arxiv.org
We present TabPFN, a trained Transformer that can do supervised classification for small
tabular datasets in less than a second, needs no hyperparameter tuning and is competitive …

Confident learning: Estimating uncertainty in dataset labels

C Northcutt, L Jiang, I Chuang - Journal of Artificial Intelligence Research, 2021 - jair.org
Learning exists in the context of data, yet notions of confidence typically focus on model
predictions, not label quality. Confident learning (CL) is an alternative approach which …

Auto-sklearn 2.0: Hands-free AutoML via meta-learning

M Feurer, K Eggensperger, S Falkner… - Journal of Machine …, 2022 - jmlr.org
Automated Machine Learning (AutoML) supports practitioners and researchers with the
tedious task of designing machine learning pipelines and has recently achieved substantial …

Well-tuned simple nets excel on tabular datasets

A Kadra, M Lindauer, F Hutter… - Advances in neural …, 2021 - proceedings.neurips.cc
Tabular datasets are the last "unconquered castle" for deep learning, with traditional ML
methods like Gradient-Boosted Decision Trees still performing strongly even against recent …

Auto-PyTorch: Multi-fidelity metalearning for efficient and robust AutoDL

L Zimmer, M Lindauer, F Hutter - IEEE Transactions on Pattern …, 2021 - ieeexplore.ieee.org
While early AutoML frameworks focused on optimizing traditional ML pipelines and their
hyperparameters, a recent trend in AutoML is to focus on neural architecture search. In this …

Sampling weights of deep neural networks

EL Bolager, I Burak, C Datar, Q Sun… - Advances in Neural …, 2023 - proceedings.neurips.cc
We introduce a probability distribution, combined with an efficient sampling algorithm, for
weights and biases of fully-connected neural networks. In a supervised learning context, no …

Data-OOB: Out-of-bag estimate as a simple and efficient data value

Y Kwon, J Zou - International Conference on Machine …, 2023 - proceedings.mlr.press
Data valuation is a powerful framework for providing statistical insights into which data are
beneficial or detrimental to model training. Many Shapley-based data valuation methods …

shapiq: Shapley interactions for machine learning

M Muschalik, H Baniecki, F Fumagalli… - Advances in …, 2025 - proceedings.neurips.cc
Originally rooted in game theory, the Shapley Value (SV) has recently become an important
tool in machine learning research. Perhaps most notably, it is used for feature attribution and …

Large language models for automated data science: Introducing CAAFE for context-aware automated feature engineering

N Hollmann, S Müller, F Hutter - Advances in Neural …, 2024 - proceedings.neurips.cc
As the field of automated machine learning (AutoML) advances, it becomes increasingly
important to incorporate domain knowledge into these systems. We present an approach for …