Mobile edge intelligence for large language models: A contemporary survey
On-device large language models (LLMs), referring to running LLMs on edge devices, have
raised considerable interest since they are more cost-effective, latency-efficient, and privacy …
Foundation models in smart agriculture: Basics, opportunities, and challenges
The past decade has witnessed the rapid development and adoption of machine and deep
learning (ML & DL) methodologies in agricultural systems, showcased by great successes in …
A survey on model compression for large language models
Large Language Models (LLMs) have transformed natural language processing
tasks successfully. Yet, their large size and high computational needs pose challenges for …
KVQuant: Towards 10 million context length LLM inference with KV cache quantization
LLMs are seeing growing use for applications which require large context windows, and with
these large context windows KV cache activations surface as the dominant contributor to …
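The snippet breaks off mid-sentence, but its point is that KV cache activations grow linearly with context length and quickly become the dominant memory cost, which is what motivates quantizing them. Below is a minimal back-of-the-envelope sketch, assuming a LLaMA-2-7B-like configuration (32 layers, 32 KV heads, head dimension 128, fp16 baseline) chosen for illustration rather than taken from the paper:

```python
# Back-of-the-envelope KV cache sizing -- a minimal sketch, assuming a
# LLaMA-2-7B-like configuration. Not the paper's method; it only shows why
# the KV cache dominates memory at long contexts and what low-bit storage saves.

def kv_cache_bytes(context_len, n_layers=32, n_kv_heads=32, head_dim=128,
                   bytes_per_elem=2, batch=1):
    """Bytes needed to store keys and values for one sequence."""
    # 2x accounts for keys and values
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem * batch

for ctx in (4_096, 128_000, 1_000_000, 10_000_000):
    fp16 = kv_cache_bytes(ctx)
    int3 = fp16 * 3 / 16          # rough 3-bit footprint, ignoring scales/zero-points
    print(f"{ctx:>10,} tokens: fp16 ≈ {fp16 / 2**30:8.1f} GiB, "
          f"~3-bit ≈ {int3 / 2**30:8.1f} GiB")
```

At 4,096 tokens the fp16 cache is roughly 2 GiB under these assumptions, but at a 10-million-token context it reaches several terabytes, which is why lower-precision KV cache storage is attractive.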
OmniQuant: Omnidirectionally calibrated quantization for large language models
Large language models (LLMs) have revolutionized natural language processing tasks.
However, their practical deployment is hindered by their immense memory and computation …
Medusa: Simple LLM inference acceleration framework with multiple decoding heads
Large Language Models (LLMs) employ auto-regressive decoding that requires sequential
computation, with each step reliant on the previous one's output. This creates a bottleneck …
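As a rough illustration of the bottleneck the snippet describes, the toy sketch below contrasts plain auto-regressive decoding, where step t cannot begin until step t-1 has produced its token, with a generic draft-then-verify step of the kind that extra decoding heads enable. The lookup-table "model" and the stand-in draft function are hypothetical simplifications; Medusa's actual decoding heads and tree attention are not reproduced here.

```python
# Toy contrast: sequential auto-regressive decoding vs. a generic
# draft-then-verify step. The "model" is just a next-token lookup table;
# draft() stands in for cheap speculative heads. A sketch only.

BASE = {"a": "b", "b": "c", "c": "d", "d": "e", "e": "a"}   # toy next-token rule


def base_next(tokens):
    """One 'forward pass' of the toy base model: next token given the sequence."""
    return BASE[tokens[-1]]


def autoregressive(tokens, n_new):
    # One full pass per generated token; step t cannot start before step t-1.
    out = list(tokens)
    for _ in range(n_new):
        out.append(base_next(out))
    return out


def draft(tokens, k=3):
    # Cheap speculative guesses (imagine a small extra head, not the full model).
    out = list(tokens)
    for _ in range(k):
        out.append(BASE[out[-1]])
    return out[len(tokens):]


def draft_then_verify(tokens, n_new, k=3):
    out = list(tokens)
    target = len(tokens) + n_new
    while len(out) < target:
        guesses = draft(out, k)
        for g in guesses:
            if len(out) >= target:
                break
            true_next = base_next(out)
            out.append(true_next)        # each verification sweep yields >= 1 token
            if true_next != g:           # first mismatch invalidates later guesses
                break
    return out


assert autoregressive(["a"], 5) == draft_then_verify(["a"], 5)
```

When the guesses are usually accepted, several tokens land per verification sweep, which is the latency win such schemes aim for.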
MobileLLM: Optimizing sub-billion parameter language models for on-device use cases
This paper addresses the growing need for efficient large language models (LLMs) on
mobile devices, driven by increasing cloud costs and latency concerns. We focus on …
A survey of resource-efficient LLM and multimodal foundation models
Large foundation models, including large language models (LLMs), vision transformers
(ViTs), diffusion, and LLM-based multimodal models, are revolutionizing the entire machine …
KIVI: A tuning-free asymmetric 2bit quantization for KV cache
Efficiently serving large language models (LLMs) requires batching of many requests to
reduce the cost per request. Yet, with larger batch sizes and longer context lengths, the key …
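For readers unfamiliar with the terminology in the title, the sketch below shows the simplest group-wise form of asymmetric (zero-point) low-bit quantization applied to cached keys. The group size and grouping axis are illustrative assumptions; KIVI's specific per-channel key / per-token value layout and full-precision residual window are not reproduced.

```python
import numpy as np

# Minimal sketch of group-wise asymmetric 2-bit quantization of a KV cache
# slice. Group size and axis are illustrative, not the paper's configuration.

def quantize_asym(x, n_bits=2):
    """Quantize a 1-D group to n_bits with a per-group scale and zero-point."""
    qmax = 2 ** n_bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = np.clip(np.round((x - lo) / scale), 0, qmax).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    return q.astype(np.float32) * scale + lo

# Example: a fake slice of cached keys, quantized in groups of 32 channels.
rng = np.random.default_rng(0)
keys = rng.normal(size=(4, 128)).astype(np.float32)      # (tokens, channels)
group = 32
recon = np.empty_like(keys)
for t in range(keys.shape[0]):
    for c in range(0, keys.shape[1], group):
        q, s, z = quantize_asym(keys[t, c:c + group])
        recon[t, c:c + group] = dequantize(q, s, z)

print("mean abs error:", np.abs(keys - recon).mean())    # coarse but bounded error
```

Storing 2-bit codes plus a scale and zero-point per group is roughly an 8x reduction over fp16, which is the lever such KV cache schemes pull to fit larger batches and longer contexts.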
Towards efficient generative large language model serving: A survey from algorithms to systems
In the rapidly evolving landscape of artificial intelligence (AI), generative large language
models (LLMs) stand at the forefront, revolutionizing how we interact with our data. However …