Smoothllm: Defending large language models against jailbreaking attacks
Despite efforts to align large language models (LLMs) with human intentions, widely-used
LLMs such as GPT, Llama, and Claude are susceptible to jailbreaking attacks, wherein an …
Model-based domain generalization
Despite remarkable success in a variety of applications, it is well-known that deep learning
can fail catastrophically when presented with out-of-distribution data. Toward addressing …
Triangular Trade-off between Robustness, Accuracy, and Fairness in Deep Neural Networks: A Survey
With the rapid development of deep learning, AI systems are increasingly deployed in complex
and high-stakes domains, necessitating the simultaneous fulfillment of multiple constraints …
On the tradeoff between robustness and fairness
Interestingly, recent experimental results [2, 26, 22] have identified a robust fairness
phenomenon in adversarial training (AT), namely that a robust model well-trained by AT …
Do wider neural networks really help adversarial robustness?
Adversarial training is a powerful type of defense against adversarial examples. Previous
empirical results suggest that adversarial training requires wider networks for better …
Better safe than sorry: Preventing delusive adversaries with adversarial training
Delusive attacks aim to substantially deteriorate the test accuracy of the learning model by
slightly perturbing the features of correctly labeled training examples. By formalizing this …
The curse of overparametrization in adversarial training: Precise analysis of robust generalization for random features regression
The Annals of Statistics 2024, Vol. 52, No. 2 …
Precise statistical analysis of classification accuracies for adversarial training
The Annals of Statistics 2022, Vol. 50, No. 4, 2127–2156. https://doi.org/10.1214/22-AOS2180 …
Adversarial robustness with semi-infinite constrained learning
Despite strong performance in numerous applications, the fragility of deep learning to input
perturbations has raised serious questions about its use in safety-critical domains. While …
Probabilistically robust learning: Balancing average and worst-case performance
Many of the successes of machine learning are based on minimizing an averaged loss
function. However, it is well-known that this paradigm suffers from robustness issues that …