Applications and techniques for fast machine learning in science

AMC Deiana, N Tran, J Agar, M Blott… - Frontiers in big …, 2022 - frontiersin.org
In this community review report, we discuss applications and techniques for fast machine
learning (ML) in science—the concept of integrating powerful ML methods into the real-time …

Muse: Text-to-image generation via masked generative transformers

H Chang, H Zhang, J Barber, AJ Maschinot… - arXiv preprint arXiv …, 2023 - arxiv.org
We present Muse, a text-to-image Transformer model that achieves state-of-the-art image
generation performance while being significantly more efficient than diffusion or …

RLAIF: Scaling reinforcement learning from human feedback with AI feedback

H Lee, S Phatale, H Mansoor, KR Lu, T Mesnard… - 2023 - openreview.net
Reinforcement learning from human feedback (RLHF) is an effective technique for aligning
large language models (LLMs) to human preferences, but gathering high-quality human …

Hard prompts made easy: Gradient-based discrete optimization for prompt tuning and discovery

Y Wen, N Jain, J Kirchenbauer… - Advances in …, 2023 - proceedings.neurips.cc
The strength of modern generative models lies in their ability to be controlled through
prompts. Hard prompts comprise interpretable words and tokens, and are typically hand …

Scaling instruction-finetuned language models

HW Chung, L Hou, S Longpre, B Zoph, Y Tay… - Journal of Machine …, 2024 - jmlr.org
Finetuning language models on a collection of datasets phrased as instructions has been
shown to improve model performance and generalization to unseen tasks. In this paper we …

Unified-IO 2: Scaling autoregressive multimodal models with vision, language, audio, and action

J Lu, C Clark, S Lee, Z Zhang… - Proceedings of the …, 2024 - openaccess.thecvf.com
We present Unified-IO 2, a multimodal and multi-skill unified model capable of following
novel instructions. Unified-IO 2 can use text, images, audio, and/or videos as input and can …

Diffusion model alignment using direct preference optimization

B Wallace, M Dang, R Rafailov… - Proceedings of the …, 2024 - openaccess.thecvf.com
Large language models (LLMs) are fine-tuned using human comparison data with
Reinforcement Learning from Human Feedback (RLHF) methods to make them better …