Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arxiv preprint arxiv …, 2023‏ - arxiv.org
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

Foundation Models Defining a New Era in Vision: a Survey and Outlook

M Awais, M Naseer, S Khan, RM Anwer… - … on Pattern Analysis …, 2025‏ - ieeexplore.ieee.org
Vision systems that see and reason about the compositional nature of visual scenes are
fundamental to understanding our world. The complex relations between objects and their …

Improved baselines with visual instruction tuning

H Liu, C Li, Y Li, YJ Lee - … of the IEEE/CVF Conference on …, 2024‏ - openaccess.thecvf.com
Large multimodal models (LMM) have recently shown encouraging progress with visual
instruction tuning. In this paper we present the first systematic study to investigate the design …

Universal and transferable adversarial attacks on aligned language models

A Zou, Z Wang, N Carlini, M Nasr, JZ Kolter… - arxiv preprint arxiv …, 2023‏ - arxiv.org
Because" out-of-the-box" large language models are capable of generating a great deal of
objectionable content, recent work has focused on aligning these models in an attempt to …

Open problems and fundamental limitations of reinforcement learning from human feedback

S Casper, X Davies, C Shi, TK Gilbert… - arxiv preprint arxiv …, 2023‏ - arxiv.org
Reinforcement learning from human feedback (RLHF) is a technique for training AI systems
to align with human goals. RLHF has emerged as the central method used to finetune state …

Fine-tuning aligned language models compromises safety, even when users do not intend to!

X Qi, Y Zeng, T **e, PY Chen, R Jia, P Mittal… - arxiv preprint arxiv …, 2023‏ - arxiv.org
Optimizing large language models (LLMs) for downstream use cases often involves the
customization of pre-trained LLMs through further fine-tuning. Meta's open release of Llama …

Explainability for large language models: A survey

H Zhao, H Chen, F Yang, N Liu, H Deng, H Cai… - ACM Transactions on …, 2024‏ - dl.acm.org
Large language models (LLMs) have demonstrated impressive capabilities in natural
language processing. However, their internal mechanisms are still unclear and this lack of …

Foundational challenges in assuring alignment and safety of large language models

U Anwar, A Saparov, J Rando, D Paleka… - arxiv preprint arxiv …, 2024‏ - arxiv.org
This work identifies 18 foundational challenges in assuring the alignment and safety of large
language models (LLMs). These challenges are organized into three different categories …

Scalable extraction of training data from (production) language models

M Nasr, N Carlini, J Hayase, M Jagielski… - arxiv preprint arxiv …, 2023‏ - arxiv.org
This paper studies extractable memorization: training data that an adversary can efficiently
extract by querying a machine learning model without prior knowledge of the training …

Catastrophic jailbreak of open-source llms via exploiting generation

Y Huang, S Gupta, M **a, K Li, D Chen - arxiv preprint arxiv:2310.06987, 2023‏ - arxiv.org
The rapid progress in open-source large language models (LLMs) is significantly advancing
AI development. Extensive efforts have been made before model release to align their …