- Academic Search

Acceptable Use Policies for Foundation Models

K Klyman - Proceedings of the AAAI/ACM Conference on AI, Ethics …, 2024 - ojs.aaai.org

As foundation models have accumulated hundreds of millions of users, developers have
begun to take steps to prevent harmful types of uses. One salient intervention that foundation …

The Reality of AI and Biorisk

A Peppin, A Reuel, S Casper, E Jones, A Strait… - arXiv preprint arXiv …, 2024 - arxiv.org

To accurately and confidently answer the question 'could an AI model or system increase
biorisk', it is necessary to have both a sound theoretical threat model for how AI models or …

Compliance Cards: Automated EU AI Act Compliance Analyses amidst a Complex AI Supply Chain

B Marino, Y Chaudhary, Y Pi, RJ Yew… - arXiv preprint arXiv …, 2024 - arxiv.org

As the AI supply chain grows more complex, AI systems and models are increasingly likely
to incorporate multiple internally- or externally-sourced components such as datasets and …

Jailbreak Defense in a Narrow Domain: Limitations of Existing Methods and a New Transcript-Classifier Approach

TT Wang, J Hughes, H Sleight, R Schaeffer… - arXiv preprint arXiv …, 2024 - arxiv.org

Defending large language models against jailbreaks so that they never engage in a
broadly-defined set of forbidden behaviors is an open problem. In this paper, we investigate the …

Jailbreak Defense in a Narrow Domain: Failures of existing methods and Improving Transcript-Based Classifiers

TT Wang, J Hughes, H Sleight, R Schaeffer… - The Third Workshop on … - openreview.net

Defending large language models against jailbreaks so that they never engage in a broad
set of forbidden behaviors is an open problem. In this paper, we study if jailbreak-defense is …

Open Problems in Technical AI Governance, 2024

Acceptable Use Policies for Foundation Models
