Weak-to-strong generalization: Eliciting strong capabilities with weak supervision

C Burns, P Izmailov, JH Kirchner, B Baker… - arXiv preprint arXiv …, 2023 - arxiv.org
Widely used alignment techniques, such as reinforcement learning from human feedback
(RLHF), rely on the ability of humans to supervise model behavior, for example to evaluate …
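
The supervision structure here is simple to mock up. Below is a minimal sketch, assuming scikit-learn and a synthetic task rather than the paper's pretrained language models: a weak supervisor is fit on a little ground truth, a stronger student is trained only on the weak model's labels, and both are scored against held-out ground truth.

```python
# A minimal sketch of the weak-to-strong setup, assuming scikit-learn and a
# synthetic task (the paper's experiments use large pretrained LMs, but the
# supervision structure is the same).
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=5000, n_features=20, random_state=0)

# Weak supervisor: a small model fit on a little ground truth.
weak = LogisticRegression().fit(X[:200], y[:200])

# Strong student: never sees ground truth, only the weak model's labels.
weak_labels = weak.predict(X[200:4000])
strong = MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=300,
                       random_state=0).fit(X[200:4000], weak_labels)

# Weak-to-strong generalization: the student outperforming its supervisor
# on held-out ground truth.
print("weak supervisor acc:", weak.score(X[4000:], y[4000:]))
print("strong student acc:", strong.score(X[4000:], y[4000:]))
```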

Freematch: Self-adaptive thresholding for semi-supervised learning

Y Wang, H Chen, Q Heng, W Hou, Y Fan, Z Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
Pseudo labeling and consistency regularization approaches with confidence-based
thresholding have made great progress in semi-supervised learning (SSL). In this paper, we …
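
A minimal sketch of the core mechanism, assuming PyTorch. Only the self-adaptive global threshold (an EMA of batch confidence) is kept; FreeMatch's per-class modulation, weak/strong augmentation pair, and fairness regularizer are omitted.

```python
# A minimal sketch of self-adaptive thresholding, assuming PyTorch; only the
# global EMA threshold is shown.
import torch
import torch.nn.functional as F

def pseudo_label_loss(model, x_unlabeled, tau, ema=0.999):
    """One unlabeled-batch loss; returns the loss and the updated threshold."""
    logits = model(x_unlabeled)
    with torch.no_grad():
        conf, pseudo = F.softmax(logits, dim=-1).max(dim=-1)
        # Self-adaptive threshold: EMA of the batch's mean max-confidence.
        tau = ema * tau + (1 - ema) * conf.mean().item()
        mask = (conf >= tau).float()  # keep only confident pseudo-labels
    loss = (F.cross_entropy(logits, pseudo, reduction="none") * mask).mean()
    return loss, tau
```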

Causal inference in natural language processing: Estimation, prediction, interpretation and beyond

A Feder, KA Keith, E Manzoor, R Pryzant… - Transactions of the …, 2022 - direct.mit.edu
A fundamental goal of scientific research is to learn about causal relationships. However,
despite its critical role in the life and social sciences, causality has not had the same …

Self-training: A survey

MR Amini, V Feofanov, L Pauletto, L Hadjadj… - Neurocomputing, 2025 - Elsevier
Self-training methods have gained significant attention in recent years due to their
effectiveness in leveraging small labeled datasets and large unlabeled observations for …
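
The generic recipe the survey covers fits in a few lines. A minimal sketch, assuming scikit-learn, with a fixed confidence cutoff standing in for the many selection rules surveyed:

```python
# A minimal self-training loop: train, pseudo-label the confident unlabeled
# points, fold them into the labeled pool, and retrain.
import numpy as np
from sklearn.linear_model import LogisticRegression

def self_train(X_lab, y_lab, X_unlab, rounds=5, cutoff=0.9):
    model = LogisticRegression().fit(X_lab, y_lab)
    for _ in range(rounds):
        if len(X_unlab) == 0:
            break
        probs = model.predict_proba(X_unlab)
        keep = probs.max(axis=1) >= cutoff
        if not keep.any():
            break
        # Promote confidently pseudo-labeled points into the labeled pool.
        X_lab = np.vstack([X_lab, X_unlab[keep]])
        y_lab = np.concatenate([y_lab, model.classes_[probs[keep].argmax(axis=1)]])
        X_unlab = X_unlab[~keep]
        model = LogisticRegression().fit(X_lab, y_lab)
    return model
```

scikit-learn ships a comparable wrapper as `sklearn.semi_supervised.SelfTrainingClassifier`.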

Cycle self-training for domain adaptation

H Liu, J Wang, M Long - Advances in Neural Information …, 2021 - proceedings.neurips.cc
Mainstream approaches for unsupervised domain adaptation (UDA) learn domain-invariant
representations to narrow the domain shift, which are empirically effective but theoretically …
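
A rough sketch of the cycle idea, assuming a shared featurizer with separate source and target heads; the paper's actual bi-level formulation and its Tsallis-entropy regularizer are omitted here.

```python
# A rough sketch of cycle self-training: the source head pseudo-labels the
# target, a target head fits those pseudo-labels, and the cycle term asks
# that this target-trained head still fit the labeled source.
import torch.nn.functional as F

def cst_loss(feat, head_src, head_tgt, x_src, y_src, x_tgt):
    z_src, z_tgt = feat(x_src), feat(x_tgt)
    # Forward: the source head pseudo-labels the target domain.
    loss_src = F.cross_entropy(head_src(z_src), y_src)
    pseudo = head_src(z_tgt).argmax(dim=-1)
    loss_tgt = F.cross_entropy(head_tgt(z_tgt), pseudo)
    # Cycle: penalize pseudo-labels that only work under the domain shift.
    loss_cycle = F.cross_entropy(head_tgt(z_src), y_src)
    return loss_src + loss_tgt + loss_cycle
```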

Test time adaptation via conjugate pseudo-labels

S Goyal, M Sun, A Raghunathan… - Advances in Neural …, 2022 - proceedings.neurips.cc
Test-time adaptation (TTA) refers to adapting neural networks to distribution shifts,
specifically with access only to unlabeled test samples from the new domain at test time …
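
For a model trained with cross-entropy, the paper observes that the conjugate pseudo-label is the model's own (temperature-scaled) softmax output, so an adaptation step reduces to soft self-training on the test batch. A minimal sketch, assuming PyTorch and leaving the paper's general conjugate construction aside:

```python
# A minimal TTA step with conjugate pseudo-labels for a cross-entropy-trained
# model: the detached softmax serves as a soft pseudo-label for the update.
import torch
import torch.nn.functional as F

def tta_step(model, optimizer, x_test, temperature=1.0):
    logits = model(x_test)
    soft_pl = F.softmax(logits.detach() / temperature, dim=-1)
    loss = -(soft_pl * F.log_softmax(logits, dim=-1)).sum(dim=-1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```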

Theoretical analysis of self-training with deep networks on unlabeled data

C Wei, K Shen, Y Chen, T Ma - arXiv preprint arXiv:2010.03622, 2020 - arxiv.org
Self-training algorithms, which train a model to fit pseudolabels predicted by another
previously-learned model, have been very successful for learning with unlabeled data using …
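
For reference, the objective this line of analysis studies can be stated compactly (notation is assumed here, not taken from the paper):

```latex
% Fit f_theta to the pseudolabels of a previously-learned model F_prev
% over the unlabeled distribution U.
\min_{\theta}\; \mathbb{E}_{x \sim U}\!\left[ \ell\big( f_\theta(x),\, \hat{y}(x) \big) \right],
\qquad \hat{y}(x) = \arg\max_{k}\, F_{\mathrm{prev}}(x)_k .
```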

Theoretical analysis of weak-to-strong generalization

H Lang, D Sontag… - Advances in Neural …, 2025 - proceedings.neurips.cc
Strong student models can learn from weaker teachers: when trained on the predictions of a
weaker model, a strong pretrained student can learn to correct the weak model's errors and …
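
The quantity at stake, stated informally with assumed notation: a student f_s trained only on pairs (x, f_w(x)) exhibits weak-to-strong generalization when its ground-truth error beats its weak supervisor's.

```latex
% Weak-to-strong generalization as an error inequality, with y the ground
% truth, f_w the weak supervisor, and f_s the student fit to f_w's labels.
\Pr_{x}\!\big[ f_s(x) \neq y(x) \big] \;<\; \Pr_{x}\!\big[ f_w(x) \neq y(x) \big].
```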

Robust learning with progressive data expansion against spurious correlation

Y Deng, Y Yang, B Mirzasoleiman… - Advances in neural …, 2023 - proceedings.neurips.cc
While deep learning models have shown remarkable performance in various tasks, they are
susceptible to learning non-generalizable _spurious features_ rather than the core features …
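
A rough sketch of the expansion schedule, assuming PyTorch; the warm-up fraction, stage count, and random expansion order are placeholders, whereas the paper expands the data according to its analysis of when spurious features get picked up.

```python
# Progressive data expansion, sketched: early training sees only a small
# subset, and later stages add chunks of the remaining data.
import torch
from torch.utils.data import DataLoader, Subset

def progressive_loaders(dataset, warmup_frac=0.1, stages=5, batch_size=64):
    order = torch.randperm(len(dataset)).tolist()
    cut = int(warmup_frac * len(dataset))
    step = (len(dataset) - cut) // stages
    for s in range(stages + 1):
        # Stage s trains on the warm-up subset plus s expansion chunks.
        yield DataLoader(Subset(dataset, order[: cut + s * step]),
                         batch_size=batch_size, shuffle=True)
```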

Masktune: Mitigating spurious correlations by forcing to explore

S Asgari, A Khani, F Khani, A Gholami… - Advances in …, 2022 - proceedings.neurips.cc
A fundamental challenge of over-parameterized deep learning models is learning
meaningful data representations that yield good performance on a downstream task without …
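
A minimal sketch of the masking step in the spirit of MaskTune, assuming PyTorch and input-times-gradient saliency; the paper then fine-tunes for a single epoch on the masked inputs so the model must explore features beyond the ones it already relies on.

```python
# Mask the input positions the trained model leans on most, as estimated by
# input-x-gradient saliency, forcing later fine-tuning to explore elsewhere.
import torch
import torch.nn.functional as F

def mask_salient(model, x, y, mask_frac=0.1):
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    saliency = (grad * x).abs().flatten(1)   # per-position attribution
    k = max(1, int(mask_frac * saliency.shape[1]))
    top = saliency.topk(k, dim=1).indices    # most salient positions
    masked = x.detach().flatten(1).clone()
    masked.scatter_(1, top, 0.0)             # zero out what the model used
    return masked.view_as(x)
```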