Language model behavior: A comprehensive survey

TA Chang, BK Bergen - Computational Linguistics, 2024 - direct.mit.edu
Transformer language models have received widespread public attention, yet their
generated text is often surprising even to NLP researchers. In this survey, we discuss over …

When brain-inspired AI meets AGI

L Zhao, L Zhang, Z Wu, Y Chen, H Dai, X Yu, Z Liu… - Meta-Radiology, 2023 - Elsevier
Artificial General Intelligence (AGI) has been a long-standing goal of humanity, with
the aim of creating machines capable of performing any intellectual task that humans can …

GPT-4 technical report

J Achiam, S Adler, S Agarwal, L Ahmad… - ar**s. We investigate two setups-ICL with flipped labels and ICL with …

Managing extreme AI risks amid rapid progress

Y Bengio, G Hinton, A Yao, D Song, P Abbeel, T Darrell… - Science, 2024 - science.org
Artificial intelligence (AI) is progressing rapidly, and companies are shifting their focus to
developing generalist AI systems that can autonomously act and pursue goals. Increases in …

Exploiting programmatic behavior of LLMs: Dual-use through standard security attacks

D Kang, X Li, I Stoica, C Guestrin… - 2024 IEEE Security …, 2024 - ieeexplore.ieee.org
Recent advances in instruction-following large language models (LLMs) have led to
dramatic improvements in a range of NLP tasks. Unfortunately, we find that the same …

Red teaming language models to reduce harms: Methods, scaling behaviors, and lessons learned

D Ganguli, L Lovitt, J Kernion, A Askell, Y Bai… - arXiv preprint arXiv …, 2022 - arxiv.org
We describe our early efforts to red team language models in order to simultaneously
discover, measure, and attempt to reduce their potentially harmful outputs. We make three …