- Academic Search

Randomness in neural network training: Characterizing the impact of tooling

D Zhuang, X Zhang, S Song… - Proceedings of Machine …, 2022 - proceedings.mlsys.org

The quest for determinism in machine learning has disproportionately focused on
characterizing the impact of noise introduced by algorithmic design choices. In this work, we …

Cited by 94

[PDF] iitd.ac.in

Demystifying TensorRT: Characterizing neural network inference engine on NVIDIA edge devices

O Shafi, C Rai, R Sen… - 2021 IEEE International …, 2021 - ieeexplore.ieee.org

Edge devices are seeing tremendous growth in sensing and computational capabilities.
Running state-of-the-art deep neural network (NN) based data processing on multi-core …

Cited by 54

[PDF] acm.org

A software-defined tensor streaming multiprocessor for large-scale machine learning

D Abts, G Kimmell, A Ling, J Kim, M Boyd… - Proceedings of the 49th …, 2022 - dl.acm.org

We describe our novel commercial software-defined approach for large-scale
interconnection networks of tensor streaming processing (TSP) elements. The system …

Cited by 41

[PDF] arxiv.org

Not all GPUs are created equal: characterizing variability in large-scale, accelerator-rich systems

P Sinha, A Guliani, R Jain, B Tran… - … Conference for High …, 2022 - ieeexplore.ieee.org

Scientists are increasingly exploring and utilizing the massive parallelism of general-
purpose accelerators such as GPUs for scientific breakthroughs. As a result, datacenters …

Cited by 23

[PDF] arxiv.org

Universal checkpointing: Efficient and flexible checkpointing for large scale distributed training

X Lian, SA Jacobs, L Kurilenko, M Tanaka… - arXiv preprint arXiv …, 2024 - arxiv.org

Existing checkpointing approaches seem ill-suited for distributed training even though
hardware limitations make model parallelism, i.e., sharding model state across multiple …

Cited by 6

[PDF] arxiv.org

Reproducibility of machine learning: Terminology, recommendations and open issues

R Albertoni, S Colantonio, P Skrzypczyński… - arXiv preprint arXiv …, 2023 - arxiv.org

Reproducibility is one of the core dimensions that concur to deliver Trustworthy Artificial
Intelligence. Broadly speaking, reproducibility can be defined as the possibility to reproduce …

Cited by 18

[PDF] openreview.net

On The Fairness Impacts of Hardware Selection in Machine Learning

SH Nelaturu, NK Ravichandran, C Tran… - … on Machine Learning, 2023 - openreview.net

In the machine learning ecosystem, hardware selection is often regarded as a mere utility,
overshadowed by the spotlight on algorithms and data. This is especially relevant in …

Cited by 2

[PDF] arxiv.org

DISTWAR: Fast Differentiable Rendering on Raster-based Rendering Pipelines

S Durvasula, A Zhao, F Chen, R Liang… - arXiv preprint arXiv …, 2023 - arxiv.org

Differentiable rendering is a technique used in an important emerging class of visual
computing applications that involves representing a 3D scene as a model that is trained from …

Cited by 13

Only buffer when you need to: Reducing on-chip GPU traffic with reconfigurable local atomic buffers

P Dalmia, R Mahapatra… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org

In recent years, due to their wide availability and ease of programming, GPUs have emerged
as the accelerator of choice for a wide variety of applications including graph analytics and …

Cited by 6

[PDF] arxiv.org

Optimistic Verifiable Training by Controlling Hardware Nondeterminism

M Srivastava, S Arora, D Boneh - arXiv preprint arXiv:2403.09603, 2024 - arxiv.org

The increasing compute demands of AI systems have led to the emergence of services that
train models on behalf of clients lacking necessary resources. However, ensuring …

Cited by 4
