Acceptable Use Policies for Foundation Models
K Klyman - Proceedings of the AAAI/ACM Conference on AI, Ethics …, 2024 - ojs.aaai.org
As foundation models have accumulated hundreds of millions of users, developers have
begun to take steps to prevent harmful types of uses. One salient intervention that foundation …
The Reality of AI and Biorisk
To accurately and confidently answer the question 'could an AI model or system increase
biorisk', it is necessary to have both a sound theoretical threat model for how AI models or …
Compliance Cards: Automated EU AI Act Compliance Analyses amidst a Complex AI Supply Chain
As the AI supply chain grows more complex, AI systems and models are increasingly likely
to incorporate multiple internally- or externally-sourced components such as datasets and …
Jailbreak Defense in a Narrow Domain: Limitations of Existing Methods and a New Transcript-Classifier Approach
Defending large language models against jailbreaks so that they never engage in a broadly-
defined set of forbidden behaviors is an open problem. In this paper, we investigate the …
Jailbreak Defense in a Narrow Domain: Failures of existing methods and Improving Transcript-Based Classifiers
Defending large language models against jailbreaks so that they never engage in a broad
set of forbidden behaviors is an open problem. In this paper, we study if jailbreak-defense is …