Backdoor attacks and countermeasures in natural language processing models: A comprehensive security review

P Cheng, Z Wu, W Du, H Zhao, W Lu, G Liu - arXiv preprint arXiv …, 2023 - arxiv.org
Applying third-party data and models has become a new paradigm for language modeling
in NLP, which also introduces potential security vulnerabilities because attackers can …

A survey of backdoor attacks and defenses on large language models: Implications for security measures

S Zhao, M Jia, Z Guo, L Gan, X Xu, X Wu, J Fu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs), which bridge the gap between human language
understanding and complex problem-solving, achieve state-of-the-art performance on …

Defending against weight-poisoning backdoor attacks for parameter-efficient fine-tuning

S Zhao, L Gan, LA Tuan, J Fu, L Lyu, M Jia… - arXiv preprint arXiv …, 2024 - arxiv.org
Recently, various parameter-efficient fine-tuning (PEFT) strategies for language models
have been proposed and successfully implemented. However, this raises …

Beyond perplexity: Multi-dimensional safety evaluation of LLM compression

Z Xu, A Gupta, T Li, O Bentham, V Srikumar - arXiv preprint arXiv …, 2024 - arxiv.org
Increasingly, model compression techniques enable large language models (LLMs) to be
deployed in real-world applications. As a result of this momentum towards local deployment …

Weak-to-Strong Backdoor Attack for Large Language Models

S Zhao, L Gan, Z Guo, X Wu, L Xiao, X Xu… - arXiv preprint arXiv …, 2024 - arxiv.org
Despite being widely applied due to their exceptional capabilities, Large Language Models
(LLMs) have been proven to be vulnerable to backdoor attacks. These attacks introduce …

Fewer is More: Trojan Attacks on Parameter-Efficient Fine-Tuning

L Hong, T Wang - arXiv preprint arXiv:2310.00648, 2023 - arxiv.org
Parameter-efficient fine-tuning (PEFT) enables efficient adaptation of pre-trained language
models (PLMs) to specific tasks. By tuning only a minimal set of (extra) parameters, PEFT …

Exploring Clean Label Backdoor Attacks and Defense in Language Models

S Zhao, LA Tuan, J Fu, J Wen… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
Despite being widely applied, pre-trained language models have been proven vulnerable to
backdoor attacks. Backdoor attacks are designed to introduce targeted vulnerabilities into …

SpamDam: Towards Privacy-Preserving and Adversary-Resistant SMS Spam Detection

Y Li, R Zhang, W Rong, X Mi - arXiv preprint arXiv:2404.09481, 2024 - arxiv.org
In this study, we introduce SpamDam, an SMS spam detection framework designed to
overcome key challenges in detecting and understanding SMS spam, such as the lack of …

Persistent Backdoor Attacks in Continual Learning

Z Guo, A Kumar, R Tourani - arXiv preprint arXiv:2409.13864, 2024 - arxiv.org
Backdoor attacks pose a significant threat to neural networks, enabling adversaries to
manipulate model outputs on specific inputs, often with devastating consequences …

DarkMind: Latent Chain-of-Thought Backdoor in Customized LLMs

Z Guo, R Tourani - arXiv preprint arXiv:2501.18617, 2025 - arxiv.org
With the growing demand for personalized AI solutions, customized LLMs have become a
preferred choice for businesses and individuals, driving the deployment of millions of AI …