A survey on evaluation of large language models

Y Chang, X Wang, J Wang, Y Wu, L Yang… - ACM Transactions on …, 2024 - dl.acm.org
Large language models (LLMs) are gaining increasing popularity in both academia and
industry, owing to their unprecedented performance in various applications. As LLMs …

A comprehensive survey on test-time adaptation under distribution shifts

J Liang, R He, T Tan - International Journal of Computer Vision, 2025 - Springer
Machine learning methods strive to acquire a robust model during the training
process that can effectively generalize to test samples, even in the presence of distribution …

Robust and data-efficient generalization of self-supervised machine learning for diagnostic imaging

S Azizi, L Culp, J Freyberg, B Mustafa, S Baur… - Nature Biomedical …, 2023 - nature.com
Machine-learning models for medical tasks can match or surpass the performance
of clinical experts. However, in settings differing from those of the training dataset, the …

Robust test-time adaptation in dynamic scenarios

L Yuan, B Xie, S Li - … of the IEEE/CVF Conference on …, 2023 - openaccess.thecvf.com
Test-time adaptation (TTA) intends to adapt the pretrained model to test distributions with
only unlabeled test data streams. Most of the previous TTA methods have achieved great …

Teaching models to express their uncertainty in words

S Lin, J Hilton, O Evans - arXiv preprint arXiv:2205.14334, 2022 - arxiv.org
We show that a GPT-3 model can learn to express uncertainty about its own answers in
natural language--without use of model logits. When given a question, the model generates …

Single-source domain expansion network for cross-scene hyperspectral image classification

Y Zhang, W Li, W Sun, R Tao… - IEEE Transactions on …, 2023 - ieeexplore.ieee.org
Currently, cross-scene hyperspectral image (HSI) classification has drawn increasing
attention. It is necessary to train a model only on source domain (SD) and directly …

Towards out-of-distribution generalization: A survey

J Liu, Z Shen, Y He, X Zhang, R Xu, H Yu… - arXiv preprint arXiv …, 2021 - arxiv.org
Traditional machine learning paradigms are based on the assumption that both training and
test data follow the same statistical pattern, which is mathematically referred to as …

Federated learning for medical image analysis: A survey

H Guan, PT Yap, A Bozoki, M Liu - Pattern Recognition, 2024 - Elsevier
Machine learning in medical imaging often faces a fundamental dilemma, namely,
the small sample size problem. Many recent studies suggest using multi-domain data …

Federated learning for generalization, robustness, fairness: A survey and benchmark

W Huang, M Ye, Z Shi, G Wan, H Li… - IEEE Transactions on …, 2024 - ieeexplore.ieee.org
Federated learning has emerged as a promising paradigm for privacy-preserving
collaboration among different parties. Recently, with the popularity of federated learning, an …

Exact feature distribution matching for arbitrary style transfer and domain generalization

Y Zhang, M Li, R Li, K Jia… - Proceedings of the IEEE …, 2022 - openaccess.thecvf.com
Arbitrary style transfer (AST) and domain generalization (DG) are important yet challenging
visual learning tasks, which can be cast as a feature distribution matching problem. With the …