Edge learning: The enabling technology for distributed big data analytics in the edge

J Zhang, Z Qu, C Chen, H Wang, Y Zhan, B Ye… - ACM Computing …, 2021 - dl.acm.org
Machine Learning (ML) has demonstrated great promise in various fields, e.g., self-driving and smart cities, which are fundamentally altering the way individuals and organizations live, work …

Batch: Machine learning inference serving on serverless platforms with adaptive batching

A Ali, R Pinciroli, F Yan, E Smirni - … International Conference for …, 2020 - ieeexplore.ieee.org
Serverless computing is a new pay-per-use cloud service paradigm that automates resource
scaling for stateless functions and can potentially facilitate bursty machine learning serving …
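
For intuition on the adaptive batching named in the title, here is a minimal plain-Python sketch (not the paper's system or API): requests queue up and a batch is dispatched to the model once either a size limit or a waiting-time limit is reached. The class and parameter names (AdaptiveBatcher, max_batch_size, max_wait_s) are invented for illustration.

```python
import time
from collections import deque

class AdaptiveBatcher:
    """Illustrative sketch: accumulate requests and flush a batch when either
    the batch-size limit or the waiting-time limit is reached."""

    def __init__(self, max_batch_size=8, max_wait_s=0.05):
        self.max_batch_size = max_batch_size
        self.max_wait_s = max_wait_s
        self.pending = deque()
        self.oldest_arrival = None

    def submit(self, request):
        if not self.pending:
            self.oldest_arrival = time.monotonic()
        self.pending.append(request)
        return self._maybe_flush()

    def _maybe_flush(self):
        full = len(self.pending) >= self.max_batch_size
        timed_out = (self.oldest_arrival is not None and
                     time.monotonic() - self.oldest_arrival >= self.max_wait_s)
        if full or timed_out:
            batch = list(self.pending)
            self.pending.clear()
            self.oldest_arrival = None
            return batch      # caller runs one model invocation on the whole batch
        return None           # keep waiting for more requests

batcher = AdaptiveBatcher(max_batch_size=4, max_wait_s=0.01)
for i in range(10):
    batch = batcher.submit(f"req-{i}")
    if batch:
        print("run inference on", batch)
# In a real server, a timer would flush any leftover pending requests.
```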

NeuGraph: Parallel deep neural network computation on large graphs

L Ma, Z Yang, Y Miao, J Xue, M Wu, L Zhou… - 2019 USENIX Annual …, 2019 - usenix.org
Recent deep learning models have moved beyond low-dimensional regular grids, such as image, video, and speech, to high-dimensional graph-structured data, such as social …
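
As background on what neural network computation over graph-structured data involves, here is a minimal NumPy sketch of one neighbour-aggregation (message-passing) layer. It shows only the generic pattern, not NeuGraph's dataflow abstraction; the toy graph and weights are invented for illustration.

```python
import numpy as np

# Toy graph: 4 nodes, undirected edges 0-1, 1-2, 2-3, with 3-dim node features.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=np.float32)
H = np.random.rand(4, 3).astype(np.float32)     # node feature matrix
W = np.random.rand(3, 2).astype(np.float32)     # layer weights

# One propagation step: aggregate neighbour features (with self-loops),
# normalise by degree, then apply a linear transform and a nonlinearity.
A_hat = A + np.eye(4, dtype=np.float32)
deg = A_hat.sum(axis=1, keepdims=True)
H_next = np.maximum((A_hat / deg) @ H @ W, 0.0)  # ReLU
print(H_next.shape)   # (4, 2): a new 2-dim embedding per node
```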

Optimizing dynamic neural networks with brainstorm

W Cui, Z Han, L Ouyang, Y Wang, N Zheng… - … USENIX Symposium on …, 2023 - usenix.org
Dynamic neural networks (NNs), which can adapt sparsely activated sub-networks to inputs
during inference, have shown significant advantages over static ones in terms of accuracy …
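
A minimal PyTorch sketch of a dynamic NN in this sense: a per-input gate routes each sample to one of two expert sub-networks, so only a sparse subset of the model's weights executes for any given input. The routing module below is invented for illustration and is not Brainstorm's abstraction.

```python
import torch
import torch.nn as nn

class TwoExpertRouter(nn.Module):
    """Sketch of input-dependent sparse activation: a gate picks one of two
    expert sub-networks per sample, so only the chosen branch runs."""

    def __init__(self, dim=16):
        super().__init__()
        self.gate = nn.Linear(dim, 2)   # routing scores per sample
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim), nn.ReLU()) for _ in range(2)])

    def forward(self, x):                          # x: (batch, dim)
        choice = self.gate(x).argmax(dim=-1)       # per-sample expert id
        out = torch.empty_like(x)
        for expert_id, expert in enumerate(self.experts):
            mask = choice == expert_id
            if mask.any():                         # execute only the selected branch
                out[mask] = expert(x[mask])
        return out

model = TwoExpertRouter()
print(model(torch.randn(4, 16)).shape)   # torch.Size([4, 16])
```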

Nimble: Efficiently compiling dynamic neural networks for model inference

H Shen, J Roesch, Z Chen, W Chen… - Proceedings of …, 2021 - proceedings.mlsys.org
Modern deep neural networks increasingly make use of features such as control flow,
dynamic data structures, and dynamic tensor shapes. Existing deep learning systems focus …
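
A small PyTorch sketch of the dynamism the abstract refers to, assuming a toy recurrent model: the loop trip count (control flow) and the tensor shapes depend on each input, so neither can be fixed at compile time. This is only an illustrative workload, not Nimble's compilation approach.

```python
import torch
import torch.nn as nn

class DynamicLengthModel(nn.Module):
    """Toy model whose control flow and tensor shapes are data-dependent."""

    def __init__(self, dim=8):
        super().__init__()
        self.step = nn.Linear(dim, dim)

    def forward(self, x):                 # x: (seq_len, dim), seq_len varies per call
        h = torch.zeros(x.shape[1:])      # dynamic shape: depends on the input
        for t in range(x.shape[0]):       # control flow: loop length is data-dependent
            h = torch.tanh(self.step(x[t]) + h)
        return h

model = DynamicLengthModel()
for seq_len in (3, 7, 5):                 # shapes change across invocations
    print(model(torch.randn(seq_len, 8)).shape)
```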

Self-aware neural network systems: A survey and new perspective

Z Du, Q Guo, Y Zhao, T Zhi, Y Chen… - Proceedings of the …, 2020 - ieeexplore.ieee.org
Neural network (NN) processors are specially designed to handle deep learning tasks by
utilizing multilayer artificial NNs. They have been demonstrated to be useful in broad …

ElasticTrainer: Speeding up on-device training with runtime elastic tensor selection

K Huang, B Yang, W Gao - Proceedings of the 21st Annual International …, 2023 - dl.acm.org
On-device training is essential for neural networks (NNs) to continuously adapt to new
online data, but can be time-consuming due to the device's limited computing power. To …
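
A hedged PyTorch sketch of the general idea of runtime tensor selection: only a chosen subset of parameter tensors is trained, and the choice is revisited as training proceeds. The "train the last k layers" rule below is a placeholder heuristic, not ElasticTrainer's actual selection policy.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(32, 32), nn.ReLU(),
                      nn.Linear(32, 32), nn.ReLU(),
                      nn.Linear(32, 10))

def select_trainable(model, last_k_layers=1):
    """Placeholder selection rule: mark only the last k linear layers trainable."""
    layers = [m for m in model if isinstance(m, nn.Linear)]
    for i, layer in enumerate(layers):
        trainable = i >= len(layers) - last_k_layers
        for p in layer.parameters():
            p.requires_grad = trainable    # frozen tensors skip backward compute

optim = torch.optim.SGD(model.parameters(), lr=0.01)
x, y = torch.randn(16, 32), torch.randint(0, 10, (16,))
for step in range(4):
    # Re-select the trainable set at runtime (here: widen it after 2 steps).
    select_trainable(model, last_k_layers=1 if step < 2 else 3)
    loss = nn.functional.cross_entropy(model(x), y)
    optim.zero_grad()
    loss.backward()
    optim.step()
    print(step, loss.item())
```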

SoD²: Statically Optimizing Dynamic Deep Neural Network Execution

W Niu, G Agrawal, B Ren - Proceedings of the 29th ACM International …, 2024 - dl.acm.org
Though many compilation and runtime systems have been developed for DNNs in recent
years, the focus has largely been on static DNNs. Dynamic DNNs, where tensor shapes and …

Enabling Large Dynamic Neural Network Training with Learning-based Memory Management

J Ren, D Xu, S Yang, J Zhao, Z Li… - … Symposium on High …, 2024 - ieeexplore.ieee.org
Dynamic neural networks (DyNNs) enable high computational efficiency and strong representation capability. However, training a DyNN can face a memory capacity problem …

Cortex: A compiler for recursive deep learning models

P Fegade, T Chen, P Gibbons… - Proceedings of Machine …, 2021 - proceedings.mlsys.org
Optimizing deep learning models is generally performed in two steps: (i) high-level graph optimizations, such as kernel fusion, and (ii) low-level kernel optimizations such as those …
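
For a concrete picture of the recursive models named in the title, here is a minimal NumPy sketch of bottom-up tree recursion: node embeddings are produced by recursively combining children, so the computation's structure and depth follow each input tree. This is only an illustrative toy, not Cortex's programming model or its optimization pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 4
W = rng.standard_normal((2 * DIM, DIM)).astype(np.float32)  # child-combiner weights

def embed(tree):
    """Recursively embed a tree given as either a leaf vector
    or a (left_subtree, right_subtree) pair."""
    if isinstance(tree, np.ndarray):          # leaf: already an embedding
        return tree
    left, right = tree
    children = np.concatenate([embed(left), embed(right)])
    return np.tanh(children @ W)              # recursion depth is data-dependent

leaf = lambda: rng.standard_normal(DIM).astype(np.float32)
tree = ((leaf(), leaf()), (leaf(), (leaf(), leaf())))  # irregular input structure
print(embed(tree).shape)   # (4,)
```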