Backdoor attacks and defenses targeting multi-domain AI models: A comprehensive review

S Zhang, Y Pan, Q Liu, Z Yan, KKR Choo… - ACM Computing …, 2024 - dl.acm.org
Since the emergence of security concerns in artificial intelligence (AI), there has been
significant attention devoted to the examination of backdoor attacks. Attackers can utilize …

Privacy in large language models: Attacks, defenses and future directions

H Li, Y Chen, J Luo, J Wang, H Peng, Y Kang… - arXiv preprint arXiv …, 2023 - arxiv.org
The advancement of large language models (LLMs) has significantly enhanced the ability to
effectively tackle various downstream NLP tasks and unify these tasks into generative …

BackdoorLLM: A comprehensive benchmark for backdoor attacks on large language models

Y Li, H Huang, Y Zhao, X Ma, J Sun - arXiv preprint arXiv:2408.12798, 2024 - arxiv.org
Generative Large Language Models (LLMs) have made significant strides across various
tasks, but they remain vulnerable to backdoor attacks, where specific triggers in the prompt …

CleanGen: Mitigating backdoor attacks for generation tasks in large language models

Y Li, Z Xu, F Jiang, L Niu, D Sahabandu… - arXiv preprint arXiv …, 2024 - arxiv.org
The remarkable performance of large language models (LLMs) in generation tasks has
enabled practitioners to leverage publicly available models to power custom applications …

Text-Tuple-Table: Towards information integration in text-to-table generation via global tuple extraction

Z Deng, C Chan, W Wang, Y Sun, W Fan… - arXiv preprint arXiv …, 2024 - arxiv.org
The task of condensing large chunks of textual information into concise and structured tables
has gained attention recently due to the emergence of Large Language Models (LLMs) and …

NegotiationToM: A benchmark for stress-testing machine Theory of Mind on negotiation surrounding

C Chan, C Jiayang, Y Yim, Z Deng, W Fan, H Li… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have sparked substantial interest and debate concerning
their potential emergence of Theory of Mind (ToM) ability. Theory of mind evaluations …

Safety at Scale: A Comprehensive Survey of Large Model Safety

X Ma, Y Gao, Y Wang, R Wang, X Wang, Y Sun… - arXiv preprint arXiv …, 2025 - arxiv.org
The rapid advancement of large models, driven by their exceptional abilities in learning and
generalization through large-scale pre-training, has reshaped the landscape of Artificial …

PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning

T Fu, M Sharma, P Torr, SB Cohen, D Krueger… - arXiv preprint arXiv …, 2024 - arxiv.org
Preference learning is a central component for aligning current LLMs, but this process can
be vulnerable to data poisoning attacks. To address this concern, we introduce …

ECON: On the Detection and Resolution of Evidence Conflicts

C Jiayang, C Chan, Q Zhuang, L Qiu, T Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
The rise of large language models (LLMs) has significantly influenced the quality of
information in decision-making systems, leading to the prevalence of AI-generated content …