On the duality between sharpness-aware minimization and adversarial training
Adversarial Training (AT), which adversarially perturbs the input samples during training, has
been acknowledged as one of the most effective defenses against adversarial attacks, yet …
Boosting jailbreak attack with momentum
Large Language Models (LLMs) have achieved remarkable success across diverse tasks,
yet they remain vulnerable to adversarial attacks, notably the well-documented …
LUNA: A Model-Based Universal Analysis Framework for Large Language Models
Over the past decade, Artificial Intelligence (AI) has had great success and is being
used in a wide range of academic and industrial fields. More recently, Large Language …
Automata Extraction from Transformers
In modern machine learning (ML) systems, Transformer-based architectures have achieved
milestone success across a broad spectrum of tasks, yet understanding their operational …
Particle Swarm Optimization-Based Model Abstraction and Explanation Generation for a Recurrent Neural Network
Y Liu, H Wang, Y Ma - Algorithms, 2024 - mdpi.com
In text classifier models, the complexity of recurrent neural networks (RNNs) is very high
because of the vast state space and uncertainty of transitions, which makes the RNN …
Enhancing Adversarial Attacks: The Similar Target Method
S Zhang, Z Wang, Z Zhou, J Liu… - 2024 International Joint …, 2024 - ieeexplore.ieee.org
Adversarial examples are notably characterized by their strong transferability, allowing
attackers to craft these examples on their models and subsequently deploy them against …
Adaptive Resilience via Probabilistic Automaton: Safeguarding Multi-Agent Systems from Leader Missing Attacks
K Wang, X Gong - Applied Mathematics and Statistics, 2024 - sciltp.com
The resilience of leader-following structures has been a hotspot in both academic and
industrial research. Existing studies mainly focus on maintaining follower coherence, usually …
Causal Abstraction in Model Interpretability: A Compact Survey
Y Zhang - arXiv preprint arXiv:2410.20161, 2024 - arxiv.org
The pursuit of interpretable artificial intelligence has led to significant advancements in the
development of methods that aim to explain the decision-making processes of complex …
Artificial instinct
Y Li, J Wang - International Conference on Algorithms, High …, 2024 - spiedigitallibrary.org
Artificial Intelligence (AI) has made remarkable advancements, surpassing human
capabilities in various domains. This paper delves into cutting-edge AI technologies …
On the Robustness of In-Context Learning with Noisy Labels: Train, Inference, and Beyond
C Cheng, H Wen, X Yu, Z Wei - chencheng.me
Recently, the mysterious In-Context Learning (ICL) ability of Transformer
architecture, particularly in large language models, has garnered considerable research …