A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - ar** language-image pre-training with frozen image encoders and large language models
J Li, D Li, S Savarese, S Hoi - International conference on …, 2023 - proceedings.mlr.press
The cost of vision-and-language pre-training has become increasingly prohibitive due to
end-to-end training of large-scale models. This paper proposes BLIP-2, a generic and …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arxiv preprint arxiv …, 2023 - arxiv.org
Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

The llama 3 herd of models

A Dubey, A Jauhri, A Pandey, A Kadian… - arxiv preprint arxiv …, 2024 - arxiv.org
Modern artificial intelligence (AI) systems are powered by foundation models. This paper
presents a new set of foundation models, called Llama 3. It is a herd of language models …

Image as a foreign language: Beit pretraining for vision and vision-language tasks

W Wang, H Bao, L Dong, J Bjorck… - Proceedings of the …, 2023 - openaccess.thecvf.com
A big convergence of language, vision, and multimodal pretraining is emerging. In this work,
we introduce a general-purpose multimodal foundation model BEiT-3, which achieves …

Pre-train, prompt, and predict: A systematic survey of prompting methods in natural language processing

P Liu, W Yuan, J Fu, Z Jiang, H Hayashi… - ACM Computing …, 2023 - dl.acm.org
This article surveys and organizes research works in a new paradigm in natural language
processing, which we dub “prompt-based learning.” Unlike traditional supervised learning …

A comprehensive survey on pretrained foundation models: A history from bert to chatgpt

C Zhou, Q Li, C Li, J Yu, Y Liu, G Wang… - International Journal of …, 2024 - Springer
Abstract Pretrained Foundation Models (PFMs) are regarded as the foundation for various
downstream tasks across different data modalities. A PFM (eg, BERT, ChatGPT, GPT-4) is …

Codet5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation

Y Wang, W Wang, S Joty, SCH Hoi - arxiv preprint arxiv:2109.00859, 2021 - arxiv.org
Pre-trained models for Natural Languages (NL) like BERT and GPT have been recently
shown to transfer well to Programming Languages (PL) and largely benefit a broad set of …

A survey on vision transformer

K Han, Y Wang, H Chen, X Chen, J Guo… - IEEE transactions on …, 2022 - ieeexplore.ieee.org
Transformer, first applied to the field of natural language processing, is a type of deep neural
network mainly based on the self-attention mechanism. Thanks to its strong representation …