A survey of safety and trustworthiness of deep neural networks: Verification, testing, adversarial attack and defence, and interpretability
In the past few years, significant progress has been made on deep neural networks (DNNs)
in achieving human-level performance on several long-standing tasks. With the broader …
Backdoor attacks and countermeasures on deep learning: A comprehensive review
This work provides the community with a timely comprehensive review of backdoor attacks
and countermeasures on deep learning. According to the attacker's capability and affected …
Universal and transferable adversarial attacks on aligned language models
Because "out-of-the-box" large language models are capable of generating a great deal of
objectionable content, recent work has focused on aligning these models in an attempt to …
Weight poisoning attacks on pre-trained models
Recently, NLP has seen a surge in the usage of large pre-trained models. Users download
weights of models pre-trained on large datasets, then fine-tune the weights on a task of their …
Adversarial deepfakes: Evaluating vulnerability of deepfake detectors to adversarial examples
Recent advances in video manipulation techniques have made the generation of fake
videos more accessible than ever before. Manipulated videos can fuel disinformation and …
AdvPulse: Universal, synchronization-free, and targeted audio adversarial attacks via subsecond perturbations
Existing efforts in audio adversarial attacks only focus on the scenarios where an adversary
has prior knowledge of the entire speech input so as to generate an adversarial example by …
A survey on universal adversarial attack
The intriguing phenomenon of adversarial examples has attracted significant attention in
machine learning and what might be more surprising to the community is the existence of …
A survey on voice assistant security: Attacks and countermeasures
Voice assistants (VA) have become prevalent on a wide range of personal devices such as
smartphones and smart speakers. As companies build voice assistants with extra …
Adversarial threats to deepfake detection: A practical perspective
Facially manipulated images and videos or DeepFakes can be used maliciously to fuel
misinformation or defame individuals. Therefore, detecting DeepFakes is crucial to increase …
Data-free universal adversarial perturbation and black-box attack
Universal adversarial perturbation (UAP), i.e., a single perturbation to fool the network for most
images, is widely recognized as a more practical attack because the UAP can be generated …