Security and privacy challenges of large language models: A survey
Large language models (LLMs) have demonstrated extraordinary capabilities and
contributed to multiple fields, such as generating and summarizing text, language …
Tool learning with foundation models
Humans possess an extraordinary ability to create and utilize tools. With the advent of
foundation models, artificial intelligence systems have the potential to be equally adept in …
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
Generative Pre-trained Transformer (GPT) models have exhibited exciting progress
in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the …
Revisiting out-of-distribution robustness in NLP: Benchmarks, analysis, and LLMs evaluations
This paper reexamines the research on out-of-distribution (OOD) robustness in the field of
NLP. We find that the distribution shift settings in previous studies commonly lack adequate …
Privacy in large language models: Attacks, defenses and future directions
The advancement of large language models (LLMs) has significantly enhanced the ability to
effectively tackle various downstream NLP tasks and unify these tasks into generative …
Backdooring instruction-tuned large language models with virtual prompt injection
Instruction-tuned Large Language Models (LLMs) have become a ubiquitous
platform for open-ended applications due to their ability to modulate responses based on …
PLMmark: a secure and robust black-box watermarking framework for pre-trained language models
The huge training overhead, considerable commercial value, and various potential security
risks make it urgent to protect the intellectual property (IP) of Deep Neural Networks (DNNs) …
Attention-enhancing backdoor attacks against BERT-based models
Recent studies have revealed that backdoor attacks can threaten the safety of natural
language processing (NLP) models. Investigating the strategies of backdoor attacks will help …
Representation in AI evaluations
Calls for representation in artificial intelligence (AI) and machine learning (ML) are
widespread, with "representation" or "representativeness" generally understood to be both …
Setting the trap: Capturing and defeating backdoors in pretrained language models through honeypots
In the field of natural language processing, the prevalent approach involves fine-tuning
pretrained language models (PLMs) using local samples. Recent research has exposed the …