Enabling resource-efficient AIoT system with cross-level optimization: A survey
The emerging field of artificial intelligence of things (AIoT, AI + IoT) is driven by the
widespread use of intelligent infrastructures and the impressive success of deep learning …
A survey on spatio-temporal big data analytics ecosystem: Resource management, processing platform, and applications
With the rapid evolution of the Internet, Internet of Things (IoT), and geographic information
systems (GIS), spatio-temporal Big Data (STBD) is experiencing exponential growth …
vPipe: A Virtualized Acceleration System for Achieving Efficient and Scalable Pipeline Parallel DNN Training
DNNs of increasing computational complexity have achieved unprecedented successes in
various areas such as machine vision and natural language processing (NLP), e.g., the …
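For context, a minimal GPipe-style micro-batching sketch of what pipeline-parallel training means (a generic illustration, not vPipe's system; in practice the two stages sit on different GPUs so that micro-batches let their work overlap):

```python
# Generic micro-batched pipeline sketch (illustration only, not vPipe).
# In a real pipeline, stage1 and stage2 live on different GPUs; splitting
# the mini-batch into micro-batches is what lets the stages overlap.
import torch
import torch.nn as nn

stage1 = nn.Linear(16, 32)        # hypothetical first pipeline stage
stage2 = nn.Linear(32, 4)         # hypothetical second pipeline stage
batch = torch.randn(8, 16)

micro_batches = batch.chunk(4)    # 4 micro-batches of 2 samples each
outputs = [stage2(stage1(mb)) for mb in micro_batches]
y = torch.cat(outputs)            # reassemble the full mini-batch output
print(y.shape)                    # torch.Size([8, 4])
```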
Fine-tuning giant neural networks on commodity hardware with automatic pipeline model parallelism
Fine-tuning is an increasingly common technique that leverages transfer learning to
dramatically expedite the training of huge, high-quality models. Critically, fine-tuning holds …
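As a minimal sketch of the fine-tuning pattern the snippet refers to (the backbone, head, and data below are placeholders, not this paper's pipeline-parallel setup): freeze the pretrained weights and train only a small task head.

```python
# Minimal fine-tuning sketch: freeze a pretrained backbone, train a new head.
# The modules and data are placeholders, not the paper's actual setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(nn.Linear(128, 64), nn.ReLU())  # stands in for a pretrained model
head = nn.Linear(64, 10)                                 # new task-specific layer

for p in backbone.parameters():
    p.requires_grad = False       # transfer learning: keep pretrained weights fixed

opt = torch.optim.Adam(head.parameters(), lr=1e-3)
x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))
loss = F.cross_entropy(head(backbone(x)), y)
loss.backward()                   # gradients flow only into the head
opt.step()
```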
Tsplit: Fine-grained GPU memory management for efficient DNN training via tensor splitting
As Deep Neural Networks (DNNs) grow deeper and larger, training them on
existing accelerators (e.g., GPUs) is challenging due to their limited device memory capacity …
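Tsplit's fine-grained memory manager is not reproduced here; as a toy illustration of the tensor-splitting idea named in the title, a large tensor can be processed one sub-tensor at a time so the largest temporary allocated is a fraction of the full size:

```python
# Toy illustration of tensor splitting (not Tsplit's actual mechanism):
# compute per-chunk so each intermediate is 1/8 of the full tensor.
import torch

x = torch.randn(4096, 1024)                   # a "large" tensor
row_norms = torch.cat([chunk.square().sum(dim=1).sqrt()  # small per-chunk temporary
                       for chunk in x.chunk(8, dim=0)])
assert torch.allclose(row_norms, x.norm(dim=1))
```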
Democratizing AI: Open-source Scalable LLM Training on GPU-based Supercomputers
S Singh, P Singhania, A Ranjan… - … Conference for High …, 2024 - ieeexplore.ieee.org
Training and fine-tuning large language models (LLMs) with hundreds of billions to trillions
of parameters requires tens of thousands of GPUs and a highly scalable software stack. In …
MegTaiChi: Dynamic tensor-based memory management optimization for DNN training
Z Hu, J Xiao, Z Deng, M Li, K Zhang, X Zhang… - Proceedings of the 36th …, 2022 - dl.acm.org
In real applications, it is common to train deep neural networks (DNNs) on modest clusters.
With the continuous increase of model size and batch size, the training of DNNs becomes …
FedDCT: Federated learning of large convolutional neural networks on resource-constrained devices using divide and collaborative training
In Federated Learning (FL), the size of local models matters. On the one hand, it is logical to
use large-capacity neural networks in pursuit of high performance. On the other hand, deep …
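FedDCT's divide-and-collaborative scheme itself is not sketched here; for context, the baseline it departs from is FedAvg-style weight averaging, where every client trains the full model and the server averages the results:

```python
# Plain FedAvg-style aggregation (context only; FedDCT instead has clients
# collaboratively train sub-models rather than one full large model each).
from typing import Dict, List
import torch
import torch.nn as nn

def average_states(states: List[Dict[str, torch.Tensor]]) -> Dict[str, torch.Tensor]:
    """Equal-weight average of client model state_dicts."""
    return {k: torch.stack([s[k].float() for s in states]).mean(dim=0)
            for k in states[0]}

# Usage with two toy "clients":
clients = [nn.Linear(4, 2).state_dict() for _ in range(2)]
global_state = average_states(clients)
```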
An oracle for guiding large-scale model/hybrid parallel training of convolutional neural networks
Deep Neural Network (DNN) frameworks use distributed training to enable faster time to
convergence and alleviate memory capacity limitations when training large models and/or …
PERKS: a locality-optimized execution model for iterative memory-bound GPU applications
Iterative memory-bound solvers commonly occur in HPC codes. Typical GPU
implementations have a loop on the host side that invokes the GPU kernel as many times as …
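The snippet describes the baseline pattern PERKS targets: a host loop that relaunches the same kernel every iteration, paying launch overhead and refetching data from device memory each time. A minimal sketch of that baseline (assuming CuPy and a CUDA device are available; PERKS itself instead uses a persistent kernel that keeps data resident on-chip across iterations):

```python
# Baseline pattern from the abstract: a host-side loop relaunching the same
# GPU kernel each iteration. PERKS replaces this with a persistent kernel.
import cupy as cp

jacobi_step = cp.RawKernel(r'''
extern "C" __global__
void jacobi_step(const float* src, float* dst, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i > 0 && i < n - 1)
        dst[i] = 0.5f * (src[i - 1] + src[i + 1]);  // 1D Jacobi stencil
}''', 'jacobi_step')

n = 1 << 20
a = cp.random.rand(n, dtype=cp.float32)
b = cp.zeros_like(a)
threads = 256
blocks = (n + threads - 1) // threads
for _ in range(100):                              # host loop: one launch per iteration
    jacobi_step((blocks,), (threads,), (a, b, cp.int32(n)))
    a, b = b, a                                   # swap buffers between launches
```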