A survey of resource-efficient LLM and multimodal foundation models
Large foundation models, including large language models (LLMs), vision transformers
(ViTs), diffusion, and LLM-based multimodal models, are revolutionizing the entire machine …
A survey on scheduling techniques in computing and network convergence
S Tang, Y Yu, H Wang, G Wang, W Chen… - … Surveys & Tutorials, 2023 - ieeexplore.ieee.org
The computing demand for massive applications has led to the ubiquitous deployment of
computing power. This trend results in the urgent need for higher-level computing resource …
Resource-efficient algorithms and systems of foundation models: A survey
Large foundation models, including large language models, vision transformers, diffusion,
and large language model-based multimodal models, are revolutionizing the entire machine …
Efficient training of large language models on distributed infrastructures: a survey
Large Language Models (LLMs) like GPT and LLaMA are revolutionizing the AI industry with
their sophisticated capabilities. Training these models requires vast GPU clusters and …
Tale of two cs: Computation vs. communication scaling for future transformers on future hardware
Scaling neural network models has delivered dramatic quality gains across ML problems.
However, this scaling also increased the reliance on efficient distributed training techniques …
Training and serving system of foundation models: A comprehensive survey
Foundation models (e.g., ChatGPT, DALL-E, PengCheng Mind, PanGu-) have demonstrated
extraordinary performance in key technological areas, such as natural language processing …
Fast state restoration in LLM serving with hcache
The growing complexity of LLM usage today, e.g., multi-round conversation and retrieval-
augmented generation (RAG), makes contextual states (i.e., KV cache) reusable across user …
Liger: Interleaving Intra- and Inter-Operator Parallelism for Distributed Large Model Inference
Distributed large model inference still faces a dilemma in balancing cost and effectiveness.
Online scenarios demand intra-operator parallelism to achieve low latency and intensive …
Exploring the performance and efficiency of transformer models for NLP on mobile devices
Deep learning (DL) is characterised by its dynamic nature, with new deep neural network
(DNN) architectures and approaches emerging every few years, driving the field's …
ProTrain: Efficient LLM Training via Memory-Aware Techniques
Training Large Language Models (LLMs) is extremely memory-hungry. To solve this problem,
existing work exploits the combination of CPU and GPU for the training process, such as …