- Academic Search

Dynabench: Rethinking benchmarking in NLP

D Kiela, M Bartolo, Y Nie, D Kaushik, A Geiger… - arXiv preprint arXiv …, 2021 - arxiv.org

We introduce Dynabench, an open-source platform for dynamic dataset creation and model
benchmarking. Dynabench runs in a web browser and supports human-and-model-in-the …

Cited by 426; all 9 versions

[PDF] arxiv.org

Machine learning testing: Survey, landscapes and horizons

JM Zhang, M Harman, L Ma… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org

This paper provides a comprehensive survey of techniques for testing machine learning
systems; Machine Learning Testing (ML testing) research. It covers 144 papers on testing …

Cited by 1012; all 14 versions

[PDF] mit.edu

An empirical study on robustness to spurious correlations using pre-trained language models

L Tu, G Lalwani, S Gella, H He - Transactions of the Association for …, 2020 - direct.mit.edu

Recent work has shown that pre-trained language models such as BERT improve
robustness to spurious correlations in the dataset. Intrigued by these results, we find that the …

Cited by 191; all 13 versions

[PDF] arxiv.org

Robustness gym: Unifying the NLP evaluation landscape

K Goel, N Rajani, J Vig, S Tan, J Wu, S Zheng… - arXiv preprint arXiv …, 2021 - arxiv.org

Despite impressive performance on standard benchmarks, deep neural networks are often
brittle when deployed in real-world systems. Consequently, recent research has focused on …

Cited by 145; all 4 versions

[PDF] arxiv.org

Towards debiasing NLU models from unknown biases

PA Utama, NS Moosavi, I Gurevych - arXiv preprint arXiv:2009.12303, 2020 - arxiv.org

NLU models often exploit biases to achieve high dataset-specific performance without
properly learning the intended task. Recently proposed debiasing methods are shown to be …

Cited by 152; all 4 versions

[PDF] arxiv.org

A fine-grained comparison of pragmatic language understanding in humans and language models

J Hu, S Floyd, O Jouravlev, E Fedorenko… - arXiv preprint arXiv …, 2022 - arxiv.org

Pragmatics and non-literal language understanding are essential to human communication,
and present a long-standing challenge for artificial language models. We perform a fine …

Cited by 75; all 4 versions

[PDF] springer.com

Quality assurance strategies for machine learning applications in big data analytics: an overview

M Ogrizović, D Drašković, D Bojić - Journal of Big Data, 2024 - Springer

Machine learning (ML) models have gained significant attention in a variety of
applications, from computer vision to natural language processing, and are almost always …

Cited by 2; all 5 versions

[PDF] arxiv.org

DISCO: Distilling counterfactuals with large language models

Z Chen, Q Gao, A Bosselut, A Sabharwal… - arXiv preprint arXiv …, 2022 - arxiv.org

Models trained with counterfactually augmented data learn representations of the causal
structure of tasks, enabling robust generalization. However, high-quality counterfactual data …

Cited by 64; all 6 versions

[PDF] arxiv.org

Are natural language inference models IMPPRESsive? Learning IMPlicature and PRESupposition

P Jeretic, A Warstadt, S Bhooshan… - arXiv preprint arXiv …, 2020 - arxiv.org

Natural language inference (NLI) is an increasingly important task for natural language
understanding, which requires one to infer whether a sentence entails another. However …

Cited by 122; all 5 versions

[PDF] arxiv.org

Text-crs: A generalized certified robustness framework against textual adversarial attacks

X Zhang, H Hong, Y Hong, P Huang… - … IEEE Symposium on …, 2024 - ieeexplore.ieee.org

The language models, especially the basic text classification models, have been shown to
be susceptible to textual adversarial attacks such as synonym substitution and word …

Cited by 21; all 4 versions

Analyzing compositionality-sensitivity of NLI models

Dynabench: Rethinking benchmarking in NLP
