Backdoor attacks and defenses targeting multi-domain AI models: A comprehensive review
Since security concerns first emerged in artificial intelligence (AI), backdoor attacks have
received significant attention. Attackers can utilize …
A survey on backdoor attack and defense in natural language processing
X Sheng, Z Han, P Li, X Chang - 2022 IEEE 22nd International …, 2022 - ieeexplore.ieee.org
Deep learning is becoming increasingly popular in real-life applications, especially in
natural language processing (NLP). Users often outsource training or adopt third …
A survey of safety and trustworthiness of large language models through the lens of verification and validation
Large language models (LLMs) have sparked a new wave of AI enthusiasm for their ability to
engage end-users in human-level conversations with detailed and articulate answers across …
Watch out for your agents! Investigating backdoor threats to LLM-based agents
W Yang, X Bi, Y Lin, S Chen… - Advances in Neural …, 2025 - proceedings.neurips.cc
Driven by the rapid development of Large Language Models (LLMs), LLM-based agents
have been developed to handle various real-world applications, including finance …
Formalizing and benchmarking prompt injection attacks and defenses
Y Liu, Y Jia, R Geng, J Jia, NZ Gong - 33rd USENIX Security Symposium …, 2024 - usenix.org
A prompt injection attack aims to inject malicious instruction/data into the input of an LLM-
Integrated Application such that it produces results as an attacker desires. Existing works are …
Revisiting the assumption of latent separability for backdoor defenses
Recent studies revealed that deep learning is susceptible to backdoor poisoning attacks. An
adversary can embed a hidden backdoor into a model to manipulate its predictions by only …
BadChain: Backdoor chain-of-thought prompting for large language models
Large language models (LLMs) are shown to benefit from chain-of-thought (CoT) prompting,
particularly when tackling tasks that require systematic reasoning processes. On the other …
Detecting backdoors in pre-trained encoders
Self-supervised learning in computer vision trains on unlabeled data, such as images or
(image, text) pairs, to obtain an image encoder that learns high-quality embeddings for input …
Privacy in large language models: Attacks, defenses and future directions
The advancement of large language models (LLMs) has significantly enhanced the ability to
effectively tackle various downstream NLP tasks and unify these tasks into generative …
A unified evaluation of textual backdoor learning: Frameworks and benchmarks
Textual backdoor attacks are a practical threat to NLP systems. By injecting a
backdoor in the training phase, the adversary could control model predictions via predefined …