A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

A review on large language models: Architectures, applications, taxonomies, open issues and challenges

MAK Raiaan, MSH Mukta, K Fatema, NM Fahad… - IEEE …, 2024 - ieeexplore.ieee.org
Large Language Models (LLMs) recently demonstrated extraordinary capability in various
natural language processing (NLP) tasks including language translation, text generation …

Qwen technical report

J Bai, S Bai, Y Chu, Z Cui, K Dang, X Deng… - arXiv preprint arXiv …, 2023 - arxiv.org
Large language models (LLMs) have revolutionized the field of artificial intelligence,
enabling natural language processing tasks that were previously thought to be exclusive to …

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang… - arXiv preprint arXiv …, 2023 - paper-notes.zhjwpku.com
Ever since the Turing Test was proposed in the 1950s, humans have explored the mastering
of language intelligence by machine. Language is essentially a complex, intricate system of …

Mamba: Linear-time sequence modeling with selective state spaces

A Gu, T Dao - arXiv preprint arXiv:2312.00752, 2023 - minjiazhang.github.io
Foundation models, now powering most of the exciting applications in deep learning, are
almost universally based on the Transformer architecture and its core attention module …

DINOv2: Learning robust visual features without supervision

M Oquab, T Darcet, T Moutakanni, H Vo… - arXiv preprint arXiv …, 2023 - arxiv.org
The recent breakthroughs in natural language processing for model pretraining on large
quantities of data have opened the way for similar foundation models in computer vision …

CogVLM: Visual expert for pretrained language models

W Wang, Q Lv, W Yu, W Hong, J Qi… - Advances in …, 2025 - proceedings.neurips.cc
We introduce CogVLM, a powerful open-source visual language foundation model. Different
from the popular shallow alignment method which maps image features into the …

mPLUG-Owl2: Revolutionizing multi-modal large language model with modality collaboration

Q Ye, H Xu, J Ye, M Yan, A Hu, H Liu… - Proceedings of the …, 2024 - openaccess.thecvf.com
Multi-modal Large Language Models (MLLMs) have demonstrated impressive
instruction abilities across various open-ended tasks. However, previous methods have …

Yi: Open foundation models by 01.AI

A Young, B Chen, C Li, C Huang, G Zhang… - arXiv preprint arXiv …, 2024 - arxiv.org
We introduce the Yi model family, a series of language and multimodal models that
demonstrate strong multi-dimensional capabilities. The Yi model family is based on 6B and …

The era of 1-bit LLMs: All large language models are in 1.58 bits

S Ma, H Wang, L Ma, L Wang… - arXiv preprint …, 2024 - storage.prod.researchhub.com
Recent research, such as BitNet [WMD+23], is paving the way for a new era of 1-bit Large
Language Models (LLMs). In this work, we introduce a 1-bit LLM variant, namely BitNet b1 …