DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models

B Wang, W Chen, H Pei, C Xie, M Kang, C Zhang, C Xu… - NeurIPS, 2023 - blogs.qub.ac.uk
Generative Pre-trained Transformer (GPT) models have exhibited exciting progress
in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the …

Unique security and privacy threats of large language model: A comprehensive survey

S Wang, T Zhu, B Liu, M Ding, X Guo, D Ye… - arXiv preprint arXiv …, 2024 - arxiv.org
With the rapid development of artificial intelligence, large language models (LLMs) have
made remarkable advancements in natural language processing. These models are trained …

Differentially private natural language models: Recent advances and future directions

L Hu, I Habernal, L Shen, D Wang - arXiv preprint arXiv:2301.09112, 2023 - arxiv.org
Recent developments in deep learning have led to great success in various natural
language processing (NLP) tasks. However, these applications may involve data that …

Privacy in large language models: Attacks, defenses and future directions

H Li, Y Chen, J Luo, J Wang, H Peng, Y Kang… - arXiv preprint arXiv …, 2023 - arxiv.org
The advancement of large language models (LLMs) has significantly enhanced the ability to
effectively tackle various downstream NLP tasks and unify these tasks into generative …

LLM-PBE: Assessing data privacy in large language models

Q Li, J Hong, C Xie, J Tan, R Xin, J Hou, X Yin… - arXiv preprint arXiv …, 2024 - arxiv.org
Large Language Models (LLMs) have become integral to numerous domains, significantly
advancing applications in data management, mining, and analysis. Their profound …

Ring-A-Bell! How Reliable are Concept Removal Methods for Diffusion Models?

YL Tsai, CY Hsu, C Xie, CH Lin, JY Chen, B Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Diffusion models for text-to-image (T2I) synthesis, such as Stable Diffusion (SD), have
recently demonstrated exceptional capabilities for generating high-quality content. However …

Risk taxonomy, mitigation, and assessment benchmarks of large language model systems

T Cui, Y Wang, C Fu, Y Xiao, S Li, X Deng, Y Liu… - arXiv preprint arXiv …, 2024 - arxiv.org
Large language models (LLMs) have strong capabilities in solving diverse natural language
processing tasks. However, the safety and security issues of LLM systems have become the …

Privacy-preserving in-context learning with differentially private few-shot generation

X Tang, R Shin, HA Inan, A Manoel… - arXiv preprint arXiv …, 2023 - arxiv.org
We study the problem of in-context learning (ICL) with large language models (LLMs) on
private datasets. This scenario poses privacy risks, as LLMs may leak or regurgitate the …

DP-OPT: Make large language model your privacy-preserving prompt engineer

J Hong, JT Wang, C Zhang, Z Li, B Li… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have emerged as dominant tools for various tasks,
particularly when tailored for a specific target by prompt tuning. Nevertheless, concerns …

PrivLM-Bench: A multi-level privacy evaluation benchmark for language models

H Li, D Guo, D Li, W Fan, Q Hu, X Liu… - Proceedings of the …, 2024 - aclanthology.org
The rapid development of language models (LMs) brings unprecedented accessibility and
usage for both models and users. On the one hand, powerful LMs achieve state-of-the-art …