Security and privacy challenges of large language models: A survey
Large language models (LLMs) have demonstrated extraordinary capabilities and
contributed to multiple fields, such as generating and summarizing text, language …
A survey on large language model (LLM) security and privacy: The good, the bad, and the ugly
Large Language Models (LLMs), such as ChatGPT and Bard, have revolutionized
natural language understanding and generation. They possess deep language …
A survey of large language models
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …
Extracting training data from diffusion models
Image diffusion models such as DALL-E 2, Imagen, and Stable Diffusion have attracted
significant attention due to their ability to generate high-quality synthetic images. In this work …
DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models
Generative Pre-trained Transformer (GPT) models have exhibited exciting progress
in their capabilities, capturing the interest of practitioners and the public alike. Yet, while the …
Scaling data-constrained language models
The current trend of scaling language models involves increasing both parameter count and
training dataset size. Extrapolating this trend suggests that training dataset size may soon be …
Holistic evaluation of language models
Language models (LMs) are becoming the foundation for almost all major language
technologies, but their capabilities, limitations, and risks are not well understood. We present …
Large language models struggle to learn long-tail knowledge
The Internet contains a wealth of knowledge—from the birthdays of historical figures to
tutorials on how to code—all of which may be learned by language models. However, while …
Emergent and predictable memorization in large language models
Memorization, or the tendency of large language models (LLMs) to output entire sequences
from their training data verbatim, is a key concern for deploying language models. In …
Poisoning language models during instruction tuning
Instruction-tuned LMs such as ChatGPT, FLAN, and InstructGPT are finetuned on datasets
that contain user-submitted examples, e.g., FLAN aggregates numerous open-source …