The Llama 3 herd of models

A Dubey, A Jauhri, A Pandey, A Kadian… - arXiv preprint arXiv …, 2024 - arxiv.org
Modern artificial intelligence (AI) systems are powered by foundation models. This paper
presents a new set of foundation models, called Llama 3. It is a herd of language models …

Ladder: Enabling Efficient Low-Precision Deep Learning Computing through Hardware-aware Tensor Transformation

L Wang, L Ma, S Cao, Q Zhang, J Xue, Y Shi… - … USENIX Symposium on …, 2024 - usenix.org
The increasing demand for improving deep learning model performance has led to a
paradigm shift in supporting low-precision computation to harness the robustness of deep …
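
As an aside on what low-precision computation means in practice, below is a minimal NumPy sketch of a symmetric int8 quantize/dequantize round trip; it is a generic illustration, not code from the Ladder system, and the tensor shape and per-tensor scaling choice are assumptions.

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor int8 quantization: x is approximated by scale * q."""
    max_abs = np.abs(x).max()
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q: np.ndarray, scale: float) -> np.ndarray:
    """Map int8 codes back to float32 to measure the information lost."""
    return q.astype(np.float32) * scale

# Round-trip a random weight matrix and report the worst-case quantization error.
w = np.random.randn(256, 256).astype(np.float32)
q, scale = quantize_int8(w)
print("max abs error:", np.abs(w - dequantize_int8(q, scale)).max())
```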

Language models scale reliably with over-training and on downstream tasks

SY Gadre, G Smyrnis, V Shankar, S Gururangan… - arXiv preprint arXiv …, 2024 - arxiv.org
Scaling laws are useful guides for derisking expensive training runs, as they predict
performance of large models using cheaper, small-scale experiments. However, there …
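
To illustrate how small-scale runs can predict large-scale performance, here is a toy sketch that fits a power law L(C) = a·C^(-b) to hypothetical (compute, loss) measurements and extrapolates it to a larger budget; the functional form and the data points are assumptions for illustration, not the paper's actual fits.

```python
import numpy as np

# Hypothetical (training compute in FLOPs, validation loss) pairs from small-scale runs.
compute = np.array([1e17, 3e17, 1e18, 3e18, 1e19])
loss = np.array([3.90, 3.52, 3.15, 2.84, 2.57])

# Fit a power law L(C) = a * C^(-b) by linear regression in log-log space.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), deg=1)
a, b = np.exp(intercept), -slope

# Extrapolate to a budget 100x beyond the largest small-scale run.
target = 1e21
predicted = a * target ** (-b)
print(f"fitted exponent b = {b:.3f}, predicted loss at 1e21 FLOPs = {predicted:.2f}")
```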

DART-Math: Difficulty-aware rejection tuning for mathematical problem-solving

Y Tong, X Zhang, R Wang, R Wu, J He - arXiv preprint arXiv:2407.13690, 2024 - arxiv.org
Solving mathematical problems requires advanced reasoning abilities and presents notable
challenges for large language models. Previous works usually synthesize data from …

Allo: A programming model for composable accelerator design

H Chen, N Zhang, S Xiang, Z Zeng, M Dai… - Proceedings of the ACM …, 2024 - dl.acm.org
Special-purpose hardware accelerators are increasingly pivotal for sustaining performance
improvements in emerging applications, especially as the benefits of technology scaling …

Smart parallel automated cryo-electron tomography

F Eisenstein, Y Fukuda, R Danev - Nature Methods, 2024 - nature.com
In situ cryo-electron tomography enables investigation of macromolecules in their native
cellular environment. Samples have become more readily available owing to recent …

NVILA: Efficient frontier visual language models

Z Liu, L Zhu, B Shi, Z Zhang, Y Lou, S Yang… - arXiv preprint arXiv …, 2024 - arxiv.org
Visual language models (VLMs) have made significant advances in accuracy in recent
years. However, their efficiency has received much less attention. This paper introduces …

Advancing state of health estimation for electric vehicles: Transformer-based approach leveraging real-world data

K Nakano, S Vögler, K Tanaka - Advances in Applied Energy, 2024 - Elsevier
The widespread adoption of electric vehicles (EVs) underscores the urgent need for
innovative approaches to estimate their lithium-ion batteries' state of health (SOH), which is …

Liger Kernel: Efficient Triton kernels for LLM training

PL Hsu, Y Dai, V Kothapalli, Q Song, S Tang… - arXiv preprint arXiv …, 2024 - arxiv.org
Training Large Language Models (LLMs) efficiently at scale presents a formidable
challenge, driven by their ever-increasing computational demands and the need for …
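
For a concrete sense of what a Triton kernel for LLM training looks like, below is a minimal illustrative RMSNorm kernel; it is a generic sketch written for this listing, not a kernel taken from the Liger codebase, and the launch configuration is an assumption.

```python
import torch
import triton
import triton.language as tl


@triton.jit
def rmsnorm_kernel(x_ptr, w_ptr, out_ptr, n_cols, eps, BLOCK_SIZE: tl.constexpr):
    # Each program instance normalizes one row of an (n_rows, n_cols) matrix.
    row = tl.program_id(0)
    cols = tl.arange(0, BLOCK_SIZE)
    mask = cols < n_cols
    x = tl.load(x_ptr + row * n_cols + cols, mask=mask, other=0.0).to(tl.float32)
    # RMSNorm: scale the row by the reciprocal of its root-mean-square, then by a learned weight.
    rms = tl.sqrt(tl.sum(x * x, axis=0) / n_cols + eps)
    w = tl.load(w_ptr + cols, mask=mask, other=0.0).to(tl.float32)
    tl.store(out_ptr + row * n_cols + cols, x / rms * w, mask=mask)


def rmsnorm(x: torch.Tensor, weight: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Row-wise RMSNorm on a 2D CUDA tensor via the kernel above (requires a GPU and triton)."""
    n_rows, n_cols = x.shape
    out = torch.empty((n_rows, n_cols), device=x.device, dtype=torch.float32)
    BLOCK_SIZE = triton.next_power_of_2(n_cols)
    rmsnorm_kernel[(n_rows,)](x, weight, out, n_cols, eps, BLOCK_SIZE=BLOCK_SIZE)
    return out
```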

Smarter, better, faster, longer: A modern bidirectional encoder for fast, memory efficient, and long context finetuning and inference

B Warner, A Chaffin, B Clavié, O Weller… - arXiv preprint arXiv …, 2024 - arxiv.org
Encoder-only transformer models such as BERT offer a great performance-size tradeoff for
retrieval and classification tasks with respect to larger decoder-only models. Despite being …
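
The retrieval setting mentioned here usually means encoding queries and documents into vectors and ranking by cosine similarity; the sketch below shows that pattern with a BERT-style encoder via the Hugging Face transformers API, where the checkpoint name and mean pooling are illustrative assumptions rather than details from the paper.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Any BERT-style encoder checkpoint works here; the name is a placeholder.
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name).eval()

def embed(texts):
    """Mean-pool the final hidden states into one L2-normalized vector per text."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state          # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()   # (B, T, 1)
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # masked mean over tokens
    return torch.nn.functional.normalize(pooled, dim=-1)

docs = ["Triton kernels speed up LLM training.", "Cryo-electron tomography images cells."]
query = ["How can I make LLM training faster?"]
scores = embed(query) @ embed(docs).T   # cosine similarity after normalization
print(scores)
```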