Machine learning at the network edge: A survey

MGS Murshed, C Murphy, D Hou, N Khan… - ACM Computing …, 2021 - dl.acm.org
Resource-constrained IoT devices, such as sensors and actuators, have become ubiquitous
in recent years. This has led to the generation of large quantities of data in real-time, which …

AI on the edge: a comprehensive review

W Su, L Li, F Liu, M He, X Liang - Artificial Intelligence Review, 2022 - Springer
With the advent of the Internet of Everything, the proliferation of data has put a huge burden
on data centers and network bandwidth. To ease the pressure on data centers, edge …

Orca: A distributed serving system for Transformer-Based generative models

GI Yu, JS Jeong, GW Kim, S Kim, BG Chun - 16th USENIX Symposium …, 2022 - usenix.org
Large-scale Transformer-based models trained for generation tasks (e.g., GPT-3) have
recently attracted huge interest, emphasizing the need for system support for serving models …

Oort: Efficient federated learning via guided participant selection

F Lai, X Zhu, HV Madhyastha… - 15th USENIX Symposium …, 2021 - usenix.org
Federated Learning (FL) is an emerging direction in distributed machine learning (ML) that
enables in-situ model training and testing on edge data. Despite having the same end goals …

INFaaS: Automated model-less inference serving

F Romero, Q Li, NJ Yadwadkar… - 2021 USENIX Annual …, 2021 - usenix.org
Despite existing work in machine learning inference serving, ease-of-use and cost efficiency
remain challenges at large scales. Developers must manually search through thousands of …

SPINN: Synergistic progressive inference of neural networks over device and cloud

S Laskaridis, SI Venieris, M Almeida… - Proceedings of the 26th …, 2020 - dl.acm.org
Despite the soaring use of convolutional neural networks (CNNs) in mobile applications,
uniformly sustaining high-performance inference on mobile has been elusive due to the …

Serving DNNs like clockwork: Performance predictability from the bottom up

A Gujarati, R Karimi, S Alzayat, W Hao… - … USENIX Symposium on …, 2020 - usenix.org
Machine learning inference is becoming a core building block for interactive web
applications. As a result, the underlying model serving systems on which these applications …

Chameleon: scalable adaptation of video analytics

J Jiang, G Ananthanarayanan, P Bodik, S Sen… - Proceedings of the …, 2018 - dl.acm.org
Applying deep convolutional neural networks (NN) to video data at scale poses a substantial
systems challenge, as improving inference accuracy often requires a prohibitive cost in …

Reducto: On-camera filtering for resource-efficient real-time video analytics

Y Li, A Padmanabhan, P Zhao, Y Wang… - Proceedings of the …, 2020 - dl.acm.org
To cope with the high resource (network and compute) demands of real-time video analytics
pipelines, recent systems have relied on frame filtering. However, filtering has typically been …

Orbital edge computing: Nanosatellite constellations as a new class of computer system

B Denby, B Lucia - Proceedings of the Twenty-Fifth International …, 2020 - dl.acm.org
Advances in nanosatellite technology and a declining cost of access to space have fostered
an emergence of large constellations of sensor-equipped satellites in low-Earth orbit. Many …