Google Scholar

Cited by 387

Orca: A distributed serving system for Transformer-based generative models

GI Yu, JS Jeong, GW Kim, S Kim, BG Chun - 16th USENIX Symposium …, 2022 - usenix.org

Large-scale Transformer-based models trained for generation tasks (e.g., GPT-3) have
recently attracted huge interest, emphasizing the need for system support for serving models …

Cited by 485

Oort: Efficient federated learning via guided participant selection

F Lai, X Zhu, HV Madhyastha… - 15th USENIX Symposium …, 2021 - usenix.org

Federated Learning (FL) is an emerging direction in distributed machine learning (ML) that
enables in-situ model training and testing on edge data. Despite having the same end goals …

Cited by 250

INFaaS: Automated model-less inference serving

F Romero, Q Li, NJ Yadwadkar… - 2021 USENIX Annual …, 2021 - usenix.org

Despite existing work in machine learning inference serving, ease-of-use and cost efficiency
remain challenges at large scales. Developers must manually search through thousands of …

Cited by 314

SPINN: Synergistic progressive inference of neural networks over device and cloud

S Laskaridis, SI Venieris, M Almeida… - Proceedings of the 26th …, 2020 - dl.acm.org

Despite the soaring use of convolutional neural networks (CNNs) in mobile applications,
uniformly sustaining high-performance inference on mobile has been elusive due to the …

Cited by 306

Serving DNNs like clockwork: Performance predictability from the bottom up

A Gujarati, R Karimi, S Alzayat, W Hao… - … USENIX Symposium on …, 2020 - usenix.org

Machine learning inference is becoming a core building block for interactive web
applications. As a result, the underlying model serving systems on which these applications …

Cited by 536

Chameleon: scalable adaptation of video analytics

J Jiang, G Ananthanarayanan, P Bodik, S Sen… - Proceedings of the …, 2018 - dl.acm.org

Applying deep convolutional neural networks (NN) to video data at scale poses a substantial
systems challenge, as improving inference accuracy often requires a prohibitive cost in …

Cited by 265

Reducto: On-camera filtering for resource-efficient real-time video analytics

Y Li, A Padmanabhan, P Zhao, Y Wang… - Proceedings of the …, 2020 - dl.acm.org

To cope with the high resource (network and compute) demands of real-time video analytics
pipelines, recent systems have relied on frame filtering. However, filtering has typically been …