A comparison review of transfer learning and self-supervised learning: Definitions, applications, advantages and limitations

Z Zhao, L Alzubaidi, J Zhang, Y Duan, Y Gu - Expert Systems with …, 2024 - Elsevier
Deep learning has emerged as a powerful tool in various domains, revolutionising machine
learning research. However, one persistent challenge is the scarcity of labelled training …

Table pre-training: A survey on model architectures, pre-training objectives, and downstream tasks

H Dong, Z Cheng, X He, M Zhou, A Zhou… - arXiv preprint arXiv …, 2022 - arxiv.org
Since a vast number of tables can be easily collected from web pages, spreadsheets, PDFs,
and various other document types, a flurry of table pre-training frameworks have been …

Why do tree-based models still outperform deep learning on typical tabular data?

L Grinsztajn, E Oyallon… - Advances in neural …, 2022 - proceedings.neurips.cc
While deep learning has enabled tremendous progress on text and image datasets, its
superiority on tabular data is not clear. We contribute extensive benchmarks of standard and …

Deep neural networks and tabular data: A survey

V Borisov, T Leemann, K Seßler, J Haug… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Heterogeneous tabular data are the most commonly used form of data and are essential for
numerous critical and computationally demanding applications. On homogeneous datasets …

TabLLM: Few-shot classification of tabular data with large language models

S Hegselmann, A Buendia, H Lang… - International …, 2023 - proceedings.mlr.press
We study the application of large language models to zero-shot and few-shot classification
of tabular data. We prompt the large language model with a serialization of the tabular data …

When do neural nets outperform boosted trees on tabular data?

D McElfresh, S Khandagale… - Advances in …, 2024 - proceedings.neurips.cc
Tabular data is one of the most commonly used types of data in machine learning. Despite
recent advances in neural nets (NNs) for tabular data, there is still an active discussion on …

TransTab: Learning transferable tabular transformers across tables

Z Wang, J Sun - Advances in Neural Information Processing …, 2022 - proceedings.neurips.cc
Tabular data (or tables) are the most widely used data format in machine learning (ML).
However, ML models often assume the table structure keeps fixed in training and testing …

SubTab: Subsetting features of tabular data for self-supervised representation learning

T Ucar, E Hajiramezanali… - Advances in Neural …, 2021 - proceedings.neurips.cc
Self-supervised learning has been shown to be very effective in learning useful
representations, and yet much of the success is achieved in data types such as images …

SAINT: Improved neural networks for tabular data via row attention and contrastive pre-training

G Somepalli, M Goldblum, A Schwarzschild… - arXiv preprint arXiv …, 2021 - arxiv.org
Tabular data underpins numerous high-impact applications of machine learning from fraud
detection to genomics and healthcare. Classical approaches to solving tabular problems …

SCARF: Self-supervised contrastive learning using random feature corruption

D Bahri, H Jiang, Y Tay, D Metzler - arXiv preprint arXiv:2106.15147, 2021 - arxiv.org
Self-supervised contrastive representation learning has proved incredibly successful in the
vision and natural language domains, enabling state-of-the-art performance with orders of …