- Academic Search

TOP-PIM: Throughput-oriented programmable processing in memory

D Zhang, N Jayasena, A Lyashevsky… - Proceedings of the 23rd …, 2014 - dl.acm.org

As computation becomes increasingly limited by data movement and energy consumption,
exploiting locality throughout the memory hierarchy becomes critical to continued …

Cited by 451

Locality exists in graph processing: Workload characterization on an Ivy Bridge server

S Beamer, K Asanovic… - 2015 IEEE International …, 2015 - ieeexplore.ieee.org

Graph processing is an increasingly important application domain and is typically
communication-bound. In this work, we analyze the performance characteristics of three …

Cited by 234

Modular routing design for chiplet-based systems

J Yin, Z Lin, O Kayiran, M Poremba… - 2018 ACM/IEEE 45th …, 2018 - ieeexplore.ieee.org

System-on-Chip (SoC) complexity and the increasing costs of silicon motivate the breaking
of an SoC into smaller "chiplets." A chiplet-based SoC design process has the promise to …

Cited by 144

Alleviating irregularity in graph analytics acceleration: A hardware/software co-design approach

M Yan, X Hu, S Li, A Basak, H Li, X Ma… - Proceedings of the …, 2019 - dl.acm.org

Graph analytics is an emerging application which extracts insights by processing large
volumes of highly connected data, namely graphs. The parallel processing of graphs has …

Cited by 87

A compiler for throughput optimization of graph algorithms on GPUs

S Pai, K Pingali - Proceedings of the 2016 ACM SIGPLAN International …, 2016 - dl.acm.org

Writing high-performance GPU implementations of graph algorithms can be challenging. In
this paper, we argue that three optimizations called throughput optimizations are key to high …

Cited by 129

Crono: A benchmark suite for multithreaded graph algorithms executing on futuristic multicores

M Ahmad, F Hijaz, Q Shi, O Khan - 2015 IEEE International …, 2015 - ieeexplore.ieee.org

Algorithms operating on a graph setting are known to be highly irregular and unstructured.
This leads to workload imbalance and data locality challenge when these algorithms are …

Cited by 141

Bandwidth-effective DRAM cache for GPUs with storage-class memory

J Hong, S Cho, G Park, W Yang… - … Symposium on High …, 2024 - ieeexplore.ieee.org

We propose overcoming the memory capacity limitation of GPUs with high-capacity Storage-
Class Memory (SCM) and DRAM cache. By significantly increasing the memory capacity …

Cited by 8

Graph processing on GPUs: Where are the bottlenecks?

Q Xu, H Jeon, M Annavaram - 2014 IEEE International …, 2014 - ieeexplore.ieee.org

Large graph processing is now a critical component of many data analytics. Graph
processing is used from social networking Web sites that provide context-aware services …

Cited by 120

Adaptive page migration for irregular data-intensive applications under GPU memory oversubscription

D Ganguly, Z Zhang, J Yang… - 2020 IEEE International …, 2020 - ieeexplore.ieee.org

Unified Memory in heterogeneous systems serves a wide range of applications. However,
limited capacity of the device memory becomes a first order performance bottleneck for data …

Cited by 61

Not all GPUs are created equal: characterizing variability in large-scale, accelerator-rich systems

P Sinha, A Guliani, R Jain, B Tran… - … Conference for High …, 2022 - ieeexplore.ieee.org

Scientists are increasingly exploring and utilizing the massive parallelism of general-
purpose accelerators such as GPUs for scientific breakthroughs. As a result, datacenters …

Cited by 22
