Acceptable Use Policies for Foundation Models

K Klyman - Proceedings of the AAAI/ACM Conference on AI, Ethics …, 2024 - ojs.aaai.org
As foundation models have accumulated hundreds of millions of users, developers have
begun to take steps to prevent harmful types of uses. One salient intervention that foundation …

The Reality of AI and Biorisk

A Peppin, A Reuel, S Casper, E Jones, A Strait… - arXiv preprint arXiv …, 2024 - arxiv.org
To accurately and confidently answer the question 'could an AI model or system increase
biorisk', it is necessary to have both a sound theoretical threat model for how AI models or …

Compliance Cards: Automated EU AI Act Compliance Analyses amidst a Complex AI Supply Chain

B Marino, Y Chaudhary, Y Pi, RJ Yew… - arXiv preprint arXiv …, 2024 - arxiv.org
As the AI supply chain grows more complex, AI systems and models are increasingly likely
to incorporate multiple internally- or externally-sourced components such as datasets and …

Jailbreak Defense in a Narrow Domain: Limitations of Existing Methods and a New Transcript-Classifier Approach

TT Wang, J Hughes, H Sleight, R Schaeffer… - arXiv preprint arXiv …, 2024 - arxiv.org
Defending large language models against jailbreaks so that they never engage in a broadly-
defined set of forbidden behaviors is an open problem. In this paper, we investigate the …

Jailbreak Defense in a Narrow Domain: Failures of existing methods and Improving Transcript-Based Classifiers

TT Wang, J Hughes, H Sleight, R Schaeffer… - The Third Workshop on … - openreview.net
Defending large language models against jailbreaks so that they never engage in a broad
set of forbidden behaviors is an open problem. In this paper, we study if jailbreak-defense is …