Mobile edge intelligence for large language models: A contemporary survey
On-device large language models (LLMs), referring to running LLMs on edge devices, have
raised considerable interest since they are more cost-effective, latency-efficient, and privacy …
Towards efficient generative large language model serving: A survey from algorithms to systems
In the rapidly evolving landscape of artificial intelligence (AI), generative large language
models (LLMs) stand at the forefront, revolutionizing how we interact with our data. However …
LocalValueBench: A collaboratively built and extensible benchmark for evaluating localized value alignment and ethical safety in large language models
GI Meadows, NWL Lau, EA Susanto, CL Yu… - arXiv preprint arXiv …, 2024 - arxiv.org
The proliferation of large language models (LLMs) requires robust evaluation of their
alignment with local values and ethical standards, especially as existing benchmarks often …
A Review on Edge Large Language Models: Design, Execution, and Applications
Large language models (LLMs) have revolutionized natural language processing with their
exceptional capabilities. However, deploying LLMs on resource-constrained edge devices …
Federated full-parameter tuning of billion-sized language models with communication cost under 18 kilobytes
Pre-trained large language models (LLMs) require fine-tuning to improve their
responsiveness to natural language instructions. Federated learning (FL) offers a way to …
LLM for mobile: An initial roadmap
When mobile meets LLMs, mobile app users deserve to have more intelligent usage
experiences. For this to happen, we argue that there is a strong need to apply LLMs for the …
Efficient training and inference: Techniques for large language models using llama
SR Cunningham, D Archambault, A Kung - Authorea Preprints, 2024 - techrxiv.org
Enhancing the efficiency of language models involves optimizing their training and
inference processes to reduce computational demands while maintaining high performance …
VB-LoRA: Extreme Parameter Efficient Fine-Tuning with Vector Banks
As the adoption of large language models increases and the need for per-user or per-task
model customization grows, the parameter-efficient fine-tuning (PEFT) methods, such as low …
AnyMatch--Efficient Zero-Shot Entity Matching with a Small Language Model
Entity matching (EM) is the problem of determining whether two records refer to the same
real-world entity, which is crucial in data integration, e.g., for product catalogs or address …
An LPDDR-based CXL-PNM Platform for TCO-efficient Inference of Transformer-based Large Language Models
Transformer-based large language models (LLMs) such as Generative Pre-trained
Transformer (GPT) have become popular due to their remarkable performance across …