- Academic Search

A survey on evaluation of out-of-distribution generalization

H Yu, J Liu, X Zhang, J Wu, P Cui - arXiv preprint arXiv:2403.01874, 2024 - arxiv.org

Machine learning models, while progressively advanced, rely heavily on the IID assumption,
which is often unfulfilled in practice due to inevitable distribution shifts. This renders them …

Cited by 9

Energy-based automated model evaluation

R Peng, H Zou, H Wang, Y Zeng, Z Huang… - arXiv preprint arXiv …, 2024 - arxiv.org

The conventional evaluation protocols on machine learning models rely heavily on a
labeled, iid-assumed testing dataset, which is not often present in real world applications …

Cited by 16

Test optimization in DNN testing: a survey

Q Hu, Y Guo, X Xie, M Cordy, L Ma… - ACM Transactions on …, 2024 - dl.acm.org

This article presents a comprehensive survey on test optimization in deep neural network
(DNN) testing. Here, test optimization refers to testing with low data labeling effort. We …

Cited by 7

Source-Free Domain-Invariant Performance Prediction

E Khramtsova, M Baktashmotlagh, G Zuccon… - … on Computer Vision, 2024 - Springer

Accurately estimating model performance poses a significant challenge, particularly in
scenarios where the source and target domains follow different data distributions. Most …

Cited by 2

Label-free evaluation for performance of fault diagnosis model on unknown distribution dataset

Z Liu, H Zheng, H Liu, W Jia, J Tan - Advanced Engineering Informatics, 2024 - Elsevier

Real-time data may undergo distribution drift due to changes in operating conditions and
other factors, which can affect the classification accuracy of online fault diagnosis models …

Cited by 1

CIFAR-10-Warehouse: Broad and more realistic testbeds in model generalization analysis

X Sun, X Leng, Z Wang, Y Yang, Z Huang… - arXiv preprint arXiv …, 2023 - arxiv.org

Analyzing model performance in various unseen environments is a critical research problem
in the machine learning community. To study this problem, it is important to construct a …

Cited by 5

Active Testing of Large Language Model via Multi-Stage Sampling

Y Huang, J Song, Q Hu, F Juefei-Xu, L Ma - arXiv preprint arXiv …, 2024 - arxiv.org

Performance evaluation plays a crucial role in the development life cycle of large language
models (LLMs). It estimates the model's capability, elucidates behavior characteristics, and …

Cited by 2

Learning diverse features in vision transformers for improved generalization

AM Nicolicioiu, AL Nicolicioiu, B Alexe… - arXiv preprint arXiv …, 2023 - arxiv.org

Deep learning models often rely only on a small set of features even when there is a rich set
of predictive signals in the training data. This makes models brittle and sensitive to …

Cited by 2

Methodology for Evaluating the Generalization of ResNet

A Du, Q Zhou, Y Dai - Applied Sciences, 2024 - mdpi.com

Convolutional neural networks (CNNs) have achieved promising results in many tasks, and
evaluating the model's generalization ability based on the trained model and training data is …

Cited by 4

Towards Efficient Multi-Domain Knowledge Fusion Adaptation via Low-Rank Reparameterization and Noisy Label Learning: Application to Source-Free Cross …

Y Lin, Y Wang, M Zhang, H Cao, L Ma… - IEEE Internet of …, 2024 - ieeexplore.ieee.org

Domain adaptation in fault diagnosis can efficiently handle different data distributions by
co-training source and target domain data. However, the source domain data may not be …

Confidence and dispersity speak: Characterizing prediction matrix for unsupervised accuracy...

A survey on evaluation of out-of-distribution generalization
