AdaTune: Adaptive tensor program compilation made efficient

M Li, M Zhang, C Wang, M Li - Advances in Neural …, 2020 - proceedings.neurips.cc
Deep learning models are computationally intensive, and implementations often have to be
highly optimized by experts or hardware vendors to be usable in practice. The DL compiler …

Swift machine learning model serving scheduling: a region based reinforcement learning approach

H Qin, S Zawad, Y Zhou, L Yang, D Zhao… - Proceedings of the …, 2019 - dl.acm.org
The success of machine learning has spurred Machine-Learning-as-a-Service (MLaaS):
deploying trained machine learning (ML) models in the cloud to provide low-latency inference …

Know Your Enemy To Save Cloud Energy: Energy-Performance Characterization of Machine Learning Serving

J Yu, J Kim, E Seo - 2023 IEEE International Symposium on …, 2023 - ieeexplore.ieee.org
The proportion of machine learning (ML) inference in modern cloud workloads is rapidly
increasing, and graphics processing units (GPUs) are the most preferred computational …

SSR: Spatial Sequential Hybrid Architecture for Latency Throughput Tradeoff in Transformer Acceleration

J Zhuang, Z Yang, S Ji, H Huang, AK Jones… - Proceedings of the …, 2024 - dl.acm.org
As the computational intensity of chips increases, the mismatch between the shapes of
computation layers and the available computation resources significantly limits the …

Perseus: Characterizing performance and cost of multi-tenant serving for CNN models

M LeMay, S Li, T Guo - 2020 IEEE International Conference on …, 2020 - ieeexplore.ieee.org
Deep learning models are increasingly used for end-user applications, supporting both
novel features, such as facial recognition, and traditional features, e.g., web search. To …

Reinforcement-learning-empowered MLaaS scheduling for serving intelligent internet of things

H Qin, S Zawad, Y Zhou, S Padhi… - IEEE Internet of Things …, 2020 - ieeexplore.ieee.org
Machine learning (ML) has been embedded in many Internet of Things (IoT) applications
(e.g., smart home and autonomous driving). Yet it is often infeasible to deploy ML models on …

Parax: Boosting deep learning for big data analytics on many-core CPUs

L Yin, Y Zhang, Z Zhang, Y Peng, P Zhao - Proceedings of the VLDB …, 2021 - dl.acm.org
Although GPUs and accelerators are more efficient for deep learning (DL),
commercial clouds such as Facebook and Amazon now heavily use CPUs for DL computation …

FPGA-assisted Design Space Exploration of Parameterized AI Accelerators: A Quickloop Approach

K Inayat, FB Muslim, T Mahmood, J Chung - Journal of Systems …, 2024 - Elsevier
FPGAs facilitate prototyping and debug, and recently accelerate full-stack simulations due to
their rapid turnaround time (TAT). However, this TAT is restrictive in exhaustive design space …

Efficient Text-to-Code Retrieval with Cascaded Fast and Slow Transformer Models

AD Gotmare, J Li, S Joty, SCH Hoi - Proceedings of the 31st ACM Joint …, 2023 - dl.acm.org
The goal of semantic code search or text-to-code search is to retrieve a semantically
relevant code snippet from an existing code database using a natural language query …

Programming Abstractions & Systems for Autonomous Vehicles

S Kalra - 2024 - search.proquest.com
Autonomous Vehicles (AVs) have the potential to revolutionize transportation
through their significant safety, environmental, and mobility benefits. However, despite their …