A survey on evaluation of out-of-distribution generalization
Machine learning models, while progressively advanced, rely heavily on the IID assumption,
which is often unfulfilled in practice due to inevitable distribution shifts. This renders them …
Energy-based automated model evaluation
The conventional evaluation protocols on machine learning models rely heavily on a
labeled, iid-assumed testing dataset, which is not often present in real world applications …
Test optimization in DNN testing: a survey
This article presents a comprehensive survey on test optimization in deep neural network
(DNN) testing. Here, test optimization refers to testing with low data labeling effort. We …
Source-Free Domain-Invariant Performance Prediction
Accurately estimating model performance poses a significant challenge, particularly in
scenarios where the source and target domains follow different data distributions. Most …
Label-free evaluation for performance of fault diagnosis model on unknown distribution dataset
Real-time data may undergo distribution drift due to changes in operating conditions and
other factors, which can affect the classification accuracy of online fault diagnosis models …
Cifar-10-warehouse: Broad and more realistic testbeds in model generalization analysis
Analyzing model performance in various unseen environments is a critical research problem
in the machine learning community. To study this problem, it is important to construct a …
Active Testing of Large Language Model via Multi-Stage Sampling
Performance evaluation plays a crucial role in the development life cycle of large language
models (LLMs). It estimates the model's capability, elucidates behavior characteristics, and …
Learning diverse features in vision transformers for improved generalization
Deep learning models often rely only on a small set of features even when there is a rich set
of predictive signals in the training data. This makes models brittle and sensitive to …
Methodology for Evaluating the Generalization of ResNet
A Du, Q Zhou, Y Dai - Applied Sciences, 2024 - mdpi.com
Convolutional neural networks (CNNs) have achieved promising results in many tasks, and
evaluating the model's generalization ability based on the trained model and training data is …
Towards Efficient Multi-Domain Knowledge Fusion Adaptation via Low-Rank Reparameterization and Noisy Label Learning: Application to Source-Free Cross …
Y Lin, Y Wang, M Zhang, H Cao, L Ma… - IEEE Internet of …, 2024 - ieeexplore.ieee.org
Domain adaptation in fault diagnosis can efficiently handle different data distributions by
co-training source and target domain data. However, the source domain data may not be …