Mobile edge intelligence for large language models: A contemporary survey

G Qu, Q Chen, W Wei, Z Lin, X Chen… - … Surveys & Tutorials, 2025 - ieeexplore.ieee.org
On-device large language models (LLMs), referring to running LLMs on edge devices, have
raised considerable interest since they are more cost-effective, latency-efficient, and privacy …

Towards efficient generative large language model serving: A survey from algorithms to systems

X Miao, G Oliaro, Z Zhang, X Cheng… - arxiv preprint arxiv …, 2023 - arxiv.org
In the rapidly evolving landscape of artificial intelligence (AI), generative large language
models (LLMs) stand at the forefront, revolutionizing how we interact with our data. However …

Localvaluebench: A collaboratively built and extensible benchmark for evaluating localized value alignment and ethical safety in large language models

GI Meadows, NWL Lau, EA Susanto, CL Yu… - arxiv preprint arxiv …, 2024 - arxiv.org
The proliferation of large language models (LLMs) requires robust evaluation of their
alignment with local values and ethical standards, especially as existing benchmarks often …

A Review on Edge Large Language Models: Design, Execution, and Applications

Y Zheng, Y Chen, B Qian, X Shi, Y Shu… - arxiv preprint arxiv …, 2024 - arxiv.org
Large language models (LLMs) have revolutionized natural language processing with their
exceptional capabilities. However, deploying LLMs on resource-constrained edge devices …

Federated full-parameter tuning of billion-sized language models with communication cost under 18 kilobytes

Z Qin, D Chen, B Qian, B Ding, Y Li, S Deng - arxiv preprint arxiv …, 2023 - arxiv.org
Pre-trained large language models (LLMs) require fine-tuning to improve their
responsiveness to natural language instructions. Federated learning (FL) offers a way to …

Llm for mobile: An initial roadmap

D Chen, Y Liu, M Zhou, Y Zhao, H Wang… - ACM Transactions on …, 2024 - dl.acm.org
When mobile meets LLMs, mobile app users deserve more intelligent usage
experiences. For this to happen, we argue that there is a strong need to apply LLMs for the …

Efficient training and inference: Techniques for large language models using llama

SR Cunningham, D Archambault, A Kung - Authorea Preprints, 2024 - techrxiv.org
Enhancing the efficiency of language models involves optimizing their training and
inference processes to reduce computational demands while maintaining high performance …

VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks

Y Li, S Han, S Ji - arxiv preprint arxiv:2405.15179, 2024 - arxiv.org
As the adoption of large language models increases and the need for per-user or per-task
model customization grows, the parameter-efficient fine-tuning (PEFT) methods, such as low …

AnyMatch--Efficient Zero-Shot Entity Matching with a Small Language Model

Z Zhang, P Groth, I Calixto, S Schelter - arxiv preprint arxiv:2409.04073, 2024 - arxiv.org
Entity matching (EM) is the problem of determining whether two records refer to the same
real-world entity, which is crucial in data integration, e.g., for product catalogs or address …

An LPDDR-based CXL-PNM Platform for TCO-efficient Inference of Transformer-based Large Language Models

SS Park, KS Kim, J So, J Jung, J Lee… - … Symposium on High …, 2024 - ieeexplore.ieee.org
Transformer-based large language models (LLMs) such as Generative Pre-trained
Transformer (GPT) have become popular due to their remarkable performance across …