The rise and potential of large language model based agents: A survey

Z Xi, W Chen, X Guo, W He, Y Ding, B Hong… - Science China …, 2025 - Springer
For a long time, researchers have sought artificial intelligence (AI) that matches or exceeds
human intelligence. AI agents, which are artificial entities capable of sensing the …

A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

G Team, P Georgiev, VI Lei, R Burnell, L Bai… - arXiv preprint arXiv …, 2024 - arxiv.org
In this report, we introduce the Gemini 1.5 family of models, representing the next generation
of highly compute-efficient multimodal models capable of recalling and reasoning over fine …

Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond

J Yang, H Jin, R Tang, X Han, Q Feng, H Jiang… - ACM Transactions on …, 2024 - dl.acm.org
This article presents a comprehensive and practical guide for practitioners and end-users
working with Large Language Models (LLMs) in their downstream Natural Language …

Challenges and applications of large language models

J Kaddour, J Harris, M Mozes, H Bradley… - arXiv preprint arXiv …, 2023 - arxiv.org
Large Language Models (LLMs) went from non-existent to ubiquitous in the machine
learning discourse within a few years. Due to the fast pace of the field, it is difficult to identify …

RWKV: Reinventing RNNs for the transformer era

B Peng, E Alcaide, Q Anthony, A Albalak… - arXiv preprint arXiv …, 2023 - arxiv.org
Transformers have revolutionized almost all natural language processing (NLP) tasks but
suffer from memory and computational complexity that scales quadratically with sequence …
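
The quadratic scaling the snippet refers to comes from attention scoring every pair of tokens, while RNN-style models carry a fixed-size state forward. The sketch below contrasts the two costs in plain NumPy; it is an illustrative linear recurrence under assumed shapes, not RWKV's actual time-mixing formulation, and every name in it is hypothetical.

    # Illustrative sketch (not the actual RWKV formulation): contrasting the
    # O(T^2) cost of full self-attention with an O(T) recurrent update.
    import numpy as np

    def full_attention(Q, K, V):
        # Materializes a T x T score matrix: memory and compute grow quadratically in T.
        scores = Q @ K.T / np.sqrt(Q.shape[-1])
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    def linear_recurrence(K, V, decay=0.9):
        # Processes tokens one at a time with a fixed-size state: cost is linear in T.
        state = np.zeros((K.shape[-1], V.shape[-1]))
        outs = []
        for k, v in zip(K, V):
            state = decay * state + np.outer(k, v)  # constant-size summary of the past
            outs.append(k @ state)
        return np.stack(outs)

    T, d = 8, 4
    rng = np.random.default_rng(0)
    Q, K, V = rng.normal(size=(3, T, d))
    print(full_attention(Q, K, V).shape, linear_recurrence(K, V).shape)
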

FlashAttention-3: Fast and accurate attention with asynchrony and low-precision

J Shah, G Bikshandi, Y Zhang… - Advances in …, 2025 - proceedings.neurips.cc
Attention, as a core layer of the ubiquitous Transformer architecture, is the bottleneck for
large language models and long-context applications. FlashAttention elaborated an approach to speed up …
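
The usual way such kernels avoid the memory bottleneck is the online-softmax trick: keys and values are streamed in blocks while running max and sum statistics are updated, so the full T x T score matrix is never materialized in slow memory. Here is a minimal NumPy sketch of that idea, with illustrative block size and names rather than the real GPU kernel:

    # A minimal sketch of the online-softmax idea behind IO-aware attention
    # kernels such as FlashAttention. Shapes and names are assumptions.
    import numpy as np

    def blockwise_attention(q, K, V, block=4):
        # q: (d,), K: (T, d), V: (T, d); returns softmax(qK^T / sqrt(d)) V,
        # computed block by block without forming the full score vector at once.
        m = -np.inf                    # running max of scores (numerical stability)
        l = 0.0                        # running sum of exp(scores - m)
        acc = np.zeros(V.shape[-1])    # running weighted sum of values
        for i in range(0, len(K), block):
            s = K[i:i+block] @ q / np.sqrt(len(q))  # scores for this block only
            m_new = max(m, s.max())
            scale = np.exp(m - m_new)  # rescale previous partial results
            p = np.exp(s - m_new)
            l = l * scale + p.sum()
            acc = acc * scale + p @ V[i:i+block]
            m = m_new
        return acc / l

    T, d = 16, 8
    rng = np.random.default_rng(1)
    q, K, V = rng.normal(size=d), rng.normal(size=(T, d)), rng.normal(size=(T, d))
    s = K @ q / np.sqrt(d)
    ref = (np.exp(s - s.max()) / np.exp(s - s.max()).sum()) @ V
    assert np.allclose(blockwise_attention(q, K, V), ref)

The blockwise result matches the reference exactly up to float rounding; the point of the trick is that only a block of scores and three small running quantities ever need to live in fast memory.
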

Large language models: a comprehensive survey of its applications, challenges, limitations, and future prospects

MU Hadi, R Qureshi, A Shah, M Irfan, A Zafar… - Authorea …, 2023 - researchgate.net
Within the vast expanse of computerized language processing, a revolutionary entity known
as Large Language Models (LLMs) has emerged, wielding immense power in its capacity to …

UL2: Unifying language learning paradigms

Y Tay, M Dehghani, VQ Tran, X Garcia, J Wei… - arXiv preprint arXiv …, 2022 - arxiv.org
Existing pre-trained models are generally geared towards a particular class of problems. To
date, there seems to be still no consensus on what the right architecture and pre-training …

Language modeling is compression

G Delétang, A Ruoss, PA Duquenne, E Catt… - arXiv preprint arXiv …, 2023 - arxiv.org
It has long been established that predictive models can be transformed into lossless
compressors and vice versa. Incidentally, in recent years, the machine learning community …
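
The duality rests on a standard fact: an ideal entropy coder driven by a model p spends about -log2 p(x) bits on each symbol x, so better prediction directly yields shorter codes. A toy accounting of this, with a Laplace-smoothed bigram model standing in (purely hypothetically) for a language model:

    # A toy illustration of the prediction <-> compression link: an ideal
    # entropy coder driven by a predictive model spends about
    # -log2 p(next symbol) bits per symbol. The bigram "model" here is a
    # hypothetical stand-in for an LLM.
    import math
    from collections import Counter

    def bigram_model(text):
        # Laplace-smoothed next-character distribution given the previous character.
        pairs = Counter(zip(text, text[1:]))
        ctx = Counter(text[:-1])
        vocab = sorted(set(text))
        def prob(prev, nxt):
            return (pairs[(prev, nxt)] + 1) / (ctx[prev] + len(vocab))
        return prob

    text = "abracadabra abracadabra abracadabra"
    prob = bigram_model(text)
    bits = sum(-math.log2(prob(p, n)) for p, n in zip(text, text[1:]))
    print(f"ideal code length: {bits:.1f} bits vs {8 * (len(text) - 1)} bits raw")

Running it shows the model-driven code length falling well below the 8-bit-per-character baseline, which is the sense in which stronger predictors are stronger compressors.
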