How Johnny can persuade LLMs to jailbreak them: Rethinking persuasion to challenge AI safety by humanizing LLMs

Y Zeng, H Lin, J Zhang, D Yang, R Jia… - Proceedings of the 62nd …, 2024 - aclanthology.org
Most traditional AI safety research views models as machines and centers on algorithm-focused attacks developed by security experts. As large language models (LLMs) become …

Olmo: Accelerating the science of language models

D Groeneveld, I Beltagy, P Walsh, A Bhagia… - arxiv preprint arxiv …, 2024 - arxiv.org

Mapping the increasing use of LLMs in scientific papers
W Liang, Y Zhang, Z Wu, H Lepp, W Ji, X Zhao… - arxiv preprint arxiv …, 2024 - arxiv.org
Scientific publishing lays the foundation of science by disseminating research findings,
fostering collaboration, encouraging reproducibility, and ensuring that scientific knowledge …

Bigcodebench: Benchmarking code generation with diverse function calls and complex instructions

TY Zhuo, MC Vu, J Chim, H Hu, W Yu… - arxiv preprint arxiv …, 2024 - arxiv.org
Task automation has been greatly empowered by the recent advances in Large Language
Models (LLMs) via Python code, where the tasks ranging from software engineering …

An archival perspective on pretraining data

MA Desai, IV Pasquetto, AZ Jacobs, D Card - Patterns, 2024 - cell.com
Alongside an explosion in research and development related to large language models,
there has been a concomitant rise in the creation of pretraining datasets—massive …

The responsible foundation model development cheatsheet: A review of tools & resources

S Longpre, S Biderman, A Albalak… - arxiv preprint arxiv …, 2024 - arxiv.org
Foundation model development attracts a rapidly expanding body of contributors, scientists,
and applications. To help shape responsible development practices, we introduce the …

No "zero-shot" without exponential data: Pretraining concept frequency determines multimodal model performance

V Udandarao, A Prabhu, A Ghosh… - The Thirty-eighth …, 2024 - openreview.net
Web-crawled pretraining datasets underlie the impressive "zero-shot" evaluation
performance of multimodal models, such as CLIP for classification and Stable Diffusion for …

The bias amplification paradox in text-to-image generation

P Seshadri, S Singh, Y Elazar - arxiv preprint arxiv:2308.00755, 2023 - arxiv.org
Bias amplification is a phenomenon in which models exacerbate biases or stereotypes
present in the training data. In this paper, we study bias amplification in the text-to-image …