BadCLIP: Dual-embedding guided backdoor attack on multimodal contrastive learning
While existing backdoor attacks have successfully infected multimodal contrastive learning
models such as CLIP, they can be easily countered by specialized backdoor defenses for …
CleanCLIP: Mitigating data poisoning attacks in multimodal contrastive learning
Multimodal contrastive pretraining has been used to train multimodal representation models,
such as CLIP, on large amounts of paired image-text data. However, previous studies have …
Distribution preserving backdoor attack in self-supervised learning
Self-supervised learning is widely used in various domains for building foundation models. It
has been demonstrated to achieve state-of-the-art performance in a range of tasks. In the …
Towards reliable and efficient backdoor trigger inversion via decoupling benign features
Recent studies revealed that using third-party models may lead to backdoor threats, where
adversaries can maliciously manipulate model predictions based on backdoors implanted …
Django: Detecting trojans in object detection models via gaussian focus calibration
Object detection models are vulnerable to backdoor or trojan attacks, where an attacker can
inject malicious triggers into the model, leading to altered behavior during inference. As a …
SSL-Cleanse: Trojan detection and mitigation in self-supervised learning
Self-supervised learning (SSL) is a prevalent approach for encoding data representations.
Using a pre-trained SSL image encoder and subsequently training a downstream classifier …
Defenses in adversarial machine learning: A survey
Adversarial phenomena have been widely observed in machine learning (ML) systems,
especially those using deep neural networks, meaning that ML systems may produce …
LOTUS: Evasive and resilient backdoor attacks through sub-partitioning
Backdoor attacks pose a significant security threat to deep learning applications. Existing
attacks are often not evasive to established backdoor detection techniques. This …
Open problems in machine unlearning for AI safety
As AI systems become more capable, widely deployed, and increasingly autonomous in
critical areas such as cybersecurity, biological research, and healthcare, ensuring their …
Trustworthy, responsible, and safe AI: A comprehensive architectural framework for AI safety with challenges and mitigations
AI Safety is an emerging area of critical importance to the safe adoption and deployment of
AI systems. With the rapid proliferation of AI and especially with the recent advancement of …