- Academic Search

A comprehensive overview of large language models

H Naveed, AU Khan, S Qiu, M Saqib, S Anwar… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) have recently demonstrated remarkable capabilities in
natural language processing tasks and beyond. This success of LLMs has led to a large …

Cited by 693

Mobile edge intelligence for large language models: A contemporary survey

G Qu, Q Chen, W Wei, Z Lin, X Chen… - … Surveys & Tutorials, 2025 - ieeexplore.ieee.org

On-device large language models (LLMs), referring to running LLMs on edge devices, have
raised considerable interest since they are more cost-effective, latency-efficient, and privacy …

Cited by 26

A survey of large language models

WX Zhao, K Zhou, J Li, T Tang, X Wang, Y Hou… - arXiv preprint arXiv …, 2023 - arxiv.org

Language is essentially a complex, intricate system of human expressions governed by
grammatical rules. It poses a significant challenge to develop capable AI algorithms for …

Cited by 3540

A survey on model compression for large language models

X Zhu, J Li, Y Liu, C Ma, W Wang - Transactions of the Association for …, 2024 - direct.mit.edu

Large Language Models (LLMs) have transformed natural language processing
tasks successfully. Yet, their large size and high computational needs pose challenges for …

Cited by 225

Efficient large language models: A survey

Z Wan, X Wang, C Liu, S Alam, Y Zheng, J Liu… - arXiv preprint arXiv …, 2023 - arxiv.org

Large Language Models (LLMs) have demonstrated remarkable capabilities in important
tasks such as natural language understanding and language generation, and thus have the …

Cited by 125

QA-LoRA: Quantization-aware low-rank adaptation of large language models

Y Xu, L Xie, X Gu, X Chen, H Chang, H Zhang… - arXiv preprint arXiv …, 2023 - arxiv.org

Recent years have witnessed a rapid development of large language models (LLMs).
Despite the strong ability in many language-understanding tasks, the heavy computational …

Cited by 127

OmniQuant: Omnidirectionally calibrated quantization for large language models

W Shao, M Chen, Z Zhang, P Xu, L Zhao, Z Li… - arXiv preprint arXiv …, 2023 - arxiv.org

Large language models (LLMs) have revolutionized natural language processing tasks.
However, their practical deployment is hindered by their immense memory and computation …

Cited by 179

LoftQ: LoRA-fine-tuning-aware quantization for large language models

Y Li, Y Yu, C Liang, P He, N Karampatziakis… - arXiv preprint arXiv …, 2023 - arxiv.org

Quantization is an indispensable technique for serving Large Language Models (LLMs) and
has recently found its way into LoRA fine-tuning. In this work we focus on the scenario where …

Cited by 139

LLM inference unveiled: Survey and roofline model insights

Z Yuan, Y Shang, Y Zhou, Z Dong, Z Zhou… - arXiv preprint arXiv …, 2024 - arxiv.org

The field of efficient Large Language Model (LLM) inference is rapidly evolving, presenting a
unique blend of opportunities and challenges. Although the field has expanded and is …

Cited by 56

SpikeGPT: Generative pre-trained language model with spiking neural networks

RJ Zhu, Q Zhao, G Li, JK Eshraghian - arXiv preprint arXiv:2302.13939, 2023 - arxiv.org

As the size of large language models continues to scale, so do the computational
resources required to run them. Spiking Neural Networks (SNNs) have emerged as an energy …

Cited by 113
