- Academic Search

Outerspace: An outer product based sparse matrix multiplication accelerator

S Pal, J Beaumont, DH Park… - … Symposium on High …, 2018 - ieeexplore.ieee.org

Sparse matrices are widely used in graph and data analytics, machine learning, engineering
and scientific applications. This paper describes and analyzes OuterSPACE, an accelerator …

Cited by 313; 6 versions

Gamma: Leveraging Gustavson's algorithm to accelerate sparse matrix multiplication

G Zhang, N Attaluri, JS Emer, D Sanchez - Proceedings of the 26th ACM …, 2021 - dl.acm.org

Sparse matrix-sparse matrix multiplication (spMspM) is at the heart of a wide range of
scientific and machine learning applications. spMspM is inefficient on general-purpose …

Cited by 125; 5 versions

Co-designing accelerators and SoC interfaces using gem5-Aladdin

YS Shao, SL Xi, V Srinivasan, GY Wei… - 2016 49th Annual …, 2016 - ieeexplore.ieee.org

Increasing demand for power-efficient, high-performance computing has spurred a growing
number and diversity of hardware accelerators in mobile and server Systems on Chip …

Cited by 219; 7 versions

Buffets: An efficient and composable storage idiom for explicit decoupled data orchestration

M Pellauer, YS Shao, J Clemons, N Crago… - Proceedings of the …, 2019 - dl.acm.org

Accelerators spend significant area and effort on custom on-chip buffering. Unfortunately,
these solutions are strongly tied to particular designs, hampering re-usability across other …

Cited by 74; 13 versions

Zorua: A holistic approach to resource virtualization in GPUs

N Vijaykumar, K Hsieh, G Pekhimenko… - 2016 49th Annual …, 2016 - ieeexplore.ieee.org

This paper introduces a new resource virtualization framework, Zorua, that decouples the
programmer-specified resource usage of a GPU application from the actual allocation in the …

Cited by 85; 27 versions

Capstan: A vector RDA for sparsity

A Rucker, M Vilim, T Zhao, Y Zhang… - MICRO-54: 54th Annual …, 2021 - dl.acm.org

This paper proposes Capstan: a scalable, parallel-patterns-based, reconfigurable dataflow
accelerator (RDA) for sparse and dense tensor applications. Instead of designing for one …

Cited by 40; 3 versions

Efficient GPU synchronization without scopes: Saying no to complex consistency models

MD Sinclair, J Alsop, SV Adve - … of the 48th International Symposium on …, 2015 - dl.acm.org

As GPUs have become increasingly general purpose, applications with more general
sharing patterns and fine-grained synchronization have started to emerge. Unfortunately …

Cited by 94; 12 versions

SparseAdapt: Runtime control for sparse linear algebra on a reconfigurable accelerator

S Pal, A Amarnath, S Feng, M O'Boyle… - MICRO-54: 54th Annual …, 2021 - dl.acm.org

Dynamic adaptation is a post-silicon optimization technique that adapts the hardware to
workload phases. However, current adaptive approaches are oblivious to implicit phases …

Cited by 20; 5 versions

Whirlpool: Improving dynamic cache management with static data classification

A Mukkara, N Beckmann, D Sanchez - ACM SIGARCH Computer …, 2016 - dl.acm.org

Cache hierarchies are increasingly non-uniform and difficult to manage. Several techniques,
such as scratchpads or reuse hints, use static information about how programs access data …

Cited by 66; 11 versions

Morpheus: Extending the last level cache capacity in GPU systems using idle GPU core resources

S Darabi, M Sadrosadati, N Akbarzadeh… - 2022 55th IEEE/ACM …, 2022 - ieeexplore.ieee.org

Graphics Processing Units (GPUs) are widely-used accelerators for data-parallel
applications. In many GPU applications, GPU memory bandwidth bottlenecks performance …

Cited by 17; 6 versions
