Google Scholar

Weighted automata extraction and explanation of recurrent neural networks for natural language tasks

On the duality between sharpness-aware minimization and adversarial training

Y Zhang, H He, J Zhu, H Chen, Y Wang… - arXiv preprint arXiv …, 2024 - arxiv.org

Adversarial Training (AT), which adversarially perturbs the input samples during training, has
been acknowledged as one of the most effective defenses against adversarial attacks, yet …

Cited by 8

Boosting jailbreak attack with momentum

Y Zhang, Z Wei - arXiv preprint arXiv:2405.01229, 2024 - arxiv.org

Large Language Models (LLMs) have achieved remarkable success across diverse tasks,
yet they remain vulnerable to adversarial attacks, notably the well-documented …

Cited by 14

LUNA: A Model-Based Universal Analysis Framework for Large Language Models

D Song, X Xie, J Song, D Zhu, Y Huang… - IEEE Transactions …, 2024 - ieeexplore.ieee.org

Over the past decade, Artificial Intelligence (AI) has had great success and is being
used in a wide range of academic and industrial fields. More recently, Large Language …

Cited by 6

Automata Extraction from Transformers

Y Zhang, Z Wei, M Sun - arXiv preprint arXiv:2406.05564, 2024 - arxiv.org

In modern machine learning (ML) systems, Transformer-based architectures have achieved
milestone success across a broad spectrum of tasks, yet understanding their operational …

Cited by 1

Particle Swarm Optimization-Based Model Abstraction and Explanation Generation for a Recurrent Neural Network

Y Liu, H Wang, Y Ma - Algorithms, 2024 - mdpi.com

In text classifier models, the complexity of recurrent neural networks (RNNs) is very high
because of the vast state space and uncertainty of transitions, which makes the RNN …

Enhancing Adversarial Attacks: The Similar Target Method

S Zhang, Z Wang, Z Zhou, J Liu… - 2024 International Joint …, 2024 - ieeexplore.ieee.org

Adversarial examples are notably characterized by their strong transferability, allowing
attackers to craft these examples on their models and subsequently deploy them against …

Adaptive Resilience via Probabilistic Automaton: Safeguarding Multi-Agent Systems from Leader Missing Attacks

K Wang, X Gong - Applied Mathematics and Statistics, 2024 - sciltp.com

The resilience of leader-following structures has been a hotspot in both academic and
industrial research. Existing studies mainly focus on maintaining follower coherence, usually …

Causal Abstraction in Model Interpretability: A Compact Survey

Y Zhang - arXiv preprint arXiv:2410.20161, 2024 - arxiv.org

The pursuit of interpretable artificial intelligence has led to significant advancements in the
development of methods that aim to explain the decision-making processes of complex …

Artificial instinct

Y Li, J Wang - International Conference on Algorithms, High …, 2024 - spiedigitallibrary.org

Artificial Intelligence (AI) has made remarkable advancements, surpassing human
capabilities in various domains. This paper delves into cutting-edge AI technologies …

On the Robustness of In-Context Learning with Noisy Labels: Train, Inference, and Beyond

C Cheng, H Wen, X Yu, Z Wei - chencheng.me

Recently, the mysterious In-Context Learning (ICL) ability of Transformer
architecture, particularly in large language models, has garnered considerable research …
