Clip-tuning: Towards derivative-free prompt learning with a mixture of rewards

Y Chai, S Wang, Y Sun, H Tian, H Wu… - arXiv preprint arXiv …, 2022 - arxiv.org
Derivative-free prompt learning has emerged as a lightweight alternative to prompt tuning,
which only requires model inference to optimize the prompts. However, existing work did not …

Multi-CLS BERT: An efficient alternative to traditional ensembling

HS Chang, RY Sun, K Ricci, A McCallum - arXiv preprint arXiv:2210.05043, 2022 - arxiv.org
Ensembling BERT models often significantly improves accuracy, but at the cost of
significantly more computation and memory footprint. In this work, we propose Multi-CLS …

Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance

H Lu, X Liu, Y Zhou, Q Li, K Keutzer… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent studies on deep ensembles have identified the sharpness of the local minima of
individual learners and the diversity of the ensemble members as key factors in improving …

ASPEST: Bridging the Gap Between Active Learning and Selective Prediction

J Chen, J Yoon, S Ebrahimi, S Arik, S Jha… - arXiv preprint arXiv …, 2023 - arxiv.org
Selective prediction aims to learn a reliable model that abstains from making predictions
when uncertain. These predictions can then be deferred to humans for further evaluation. As …

Ensemble of winning tickets: pruning bidirectional encoder from the transformers attention heads for enhanced model efficiency

N Smarts, R Selvaraj, VM Kuthadi - International Journal of …, 2025 - researchgate.net
The advanced models of deep neural networks like bidirectional encoder from the
transformers (BERT) and others, poses challenges in terms of computational resources and …

[BOOK] Robust Deep Learning Under Distribution Shift

J Chen - 2023 - search.proquest.com
Deep learning has achieved remarkable success in various domains, including computer
vision, natural language processing, and game playing. However, this success relies on the …

Diversifying Multilayer Perceptron Ensembles in a Truly Sparse Context

PRD Wal - 2023 - essay.utwente.nl
Artificial Neural Networks are state-of-the-art machine learning models, outperforming their
competitors in many fields. One of the major drawbacks of Artificial Neural Networks are the …

Self-supervised learning and uncertainty estimation for surgical margin detection with mass spectrometry

A Syeda - 2023 - search.proquest.com
Breast cancer represents 25% of all new cancer cases and is the second leading cause of
death from cancer in Canadian women. The preferred treatment for breast cancer patients is …

Modeling the Multi-mode Distribution in Self-Supervised Language Models

HS Chang - 2022 - core.ac.uk
Recently, researchers have found that transformer-based language models (LMs), such as
GPT-2, can predict the next word distribution better as their sizes grow [177, 21, 97] …