A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arxiv preprint arxiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

[HTML][HTML] A survey of GPT-3 family large language models including ChatGPT and GPT-4

KS Kalyan - Natural Language Processing Journal, 2024 - Elsevier
Large language models (LLMs) are a special class of pretrained language models (PLMs)
obtained by scaling model size, pretraining corpus and computation. LLMs, because of their …

[PDF][PDF] A survey of large language models

WX Zhao, K Zhou, J Li, T Tang… - arxiv preprint arxiv …, 2023 - paper-notes.zhjwpku.com
Ever since the Turing Test was proposed in the 1950s, humans have explored the mastering
of language intelligence by machine. Language is essentially a complex, intricate system of …

[PDF][PDF] Mamba: Linear-time sequence modeling with selective state spaces

A Gu, T Dao - arxiv preprint arxiv:2312.00752, 2023 - minjiazhang.github.io
Foundation models, now powering most of the exciting applications in deep learning, are
almost universally based on the Transformer architecture and its core attention module …

Generative AI at work

E Brynjolfsson, D Li, L Raymond - The Quarterly Journal of …, 2025 - academic.oup.com
We study the staggered introduction of a generative AI–based conversational assistant
using data from 5,172 customer-support agents. Access to AI assistance increases worker …

[HTML][HTML] Explainable Artificial Intelligence (XAI): What we know and what is left to attain Trustworthy Artificial Intelligence

S Ali, T Abuhmed, S El-Sappagh, K Muhammad… - Information fusion, 2023 - Elsevier
Artificial intelligence (AI) is currently being utilized in a wide range of sophisticated
applications, but the outcomes of many AI models are challenging to comprehend and trust …

Scaling up gans for text-to-image synthesis

M Kang, JY Zhu, R Zhang, J Park… - Proceedings of the …, 2023 - openaccess.thecvf.com
The recent success of text-to-image synthesis has taken the world by storm and captured the
general public's imagination. From a technical standpoint, it also marked a drastic change in …

Multimodal foundation models: From specialists to general-purpose assistants

C Li, Z Gan, Z Yang, J Yang, L Li… - … and Trends® in …, 2024 - nowpublishers.com
Neural compression is the application of neural networks and other machine learning
methods to data compression. Recent advances in statistical machine learning have opened …

Transformers are ssms: Generalized models and efficient algorithms through structured state space duality

T Dao, A Gu - arxiv preprint arxiv:2405.21060, 2024 - arxiv.org
While Transformers have been the main architecture behind deep learning's success in
language modeling, state-space models (SSMs) such as Mamba have recently been shown …

Benchmarking large language models for news summarization

T Zhang, F Ladhak, E Durmus, P Liang… - Transactions of the …, 2024 - direct.mit.edu
Large language models (LLMs) have shown promise for automatic summarization but the
reasons behind their successes are poorly understood. By conducting a human evaluation …