Panacea: Mitigating Harmful Fine-tuning for Large Language Models via Post-fine-tuning Perturbation
Harmful fine-tuning attacks introduce significant security risks to fine-tuning services.
Mainstream defenses aim to vaccinate the model such that the later harmful fine-tuning …
Defending LVLMs Against Vision Attacks through Partial-Perception Supervision
Q Zhou, T Li, Q Guo, D Wang, Y Lin, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Recent studies have raised significant concerns regarding the vulnerability of Large Vision
Language Models (LVLMs) to maliciously injected or perturbed input images, which can …