- Academic Search

Full accounting for verifiable outsourcing

RS Wahby, Y Ji, AJ Blumberg, A Shelat… - Proceedings of the …, 2017 - dl.acm.org

Systems for verifiable outsourcing incur costs for a prover, a verifier, and precomputation;
outsourcing makes sense when the combination of these costs is cheaper than not …

Cited by 100 · 8 versions

[PDF] acm.org

Symphony: Orchestrating sparse and dense tensors with hierarchical heterogeneous processing

M Pellauer, J Clemons, V Balaji, N Crago… - ACM Transactions on …, 2023 - dl.acm.org

Sparse tensor algorithms are becoming widespread, particularly in the domains of deep
learning, graph and data analytics, and scientific computing. Current high-performance …

Cited by 8 · 4 versions

[PDF] mit.edu

T4: Compiling sequential code for effective speculative parallelization in hardware

VA Ying, MC Jeffrey, D Sanchez - 2020 ACM/IEEE 47th Annual …, 2020 - ieeexplore.ieee.org

Multicores are now ubiquitous, but programmers still write sequential code. Speculative
parallelization is an enticing approach to parallelize code while retaining the ease of …

Cited by 35 · 13 versions

[PDF] northwestern.edu

HELIX-UP: Relaxing program semantics to unleash parallelization

S Campanoni, G Holloway, GY Wei… - 2015 IEEE/ACM …, 2015 - ieeexplore.ieee.org

Automatic generation of parallel code for general-purpose commodity processors is a
challenging computational problem. Nevertheless, there is a lot of latent thread-level …

Cited by 62 · 8 versions

[PDF] arxiv.org

Inter-thread communication in multithreaded, reconfigurable coarse-grain arrays

D Voitsechov, O Port, Y Etsion - 2018 51st Annual IEEE/ACM …, 2018 - ieeexplore.ieee.org

Traditional von Neumann GPGPUs only allow threads to communicate through memory on a
group-to-group basis. In this model, a group of producer threads writes intermediate values …

Cited by 39 · 5 versions

[PDF] acm.org

Predicting new workload or CPU performance by analyzing public datasets

Y Wang, V Lee, GY Wei, D Brooks - ACM Transactions on Architecture …, 2019 - dl.acm.org

The marketplace for general-purpose microprocessors offers hundreds of functionally similar
models, differing by traits like frequency, core count, cache size, memory bandwidth, and …

Cited by 36 · 4 versions

[PDF] nsf.gov

Phloem: Automatic acceleration of irregular applications with fine-grain pipeline parallelism

QM Nguyen, D Sanchez - 2023 IEEE International Symposium …, 2023 - ieeexplore.ieee.org

Irregular applications are increasingly common in diverse domains, like graph analytics and
sparse linear algebra. Accelerating these applications is challenging because of their …

Cited by 9 · 5 versions

[PDF] acm.org

Trireme: Exploration of hierarchical multi-level parallelism for hardware acceleration

G Zacharopoulos, A Ejjeh, Y Jing, EY Yang… - ACM Transactions on …, 2023 - dl.acm.org

The design of heterogeneous systems that include domain specific accelerators is a
challenging and time-consuming process. While taking into account area constraints …

Cited by 8 · 4 versions

[PDF] acm.org

CARAT CAKE: Replacing paging via compiler/kernel cooperation

B Suchy, S Ghosh, D Kersnar, S Chai… - Proceedings of the 27th …, 2022 - dl.acm.org

Virtual memory, specifically paging, is undergoing significant innovation due to being
challenged by new demands from modern workloads. Recent work has demonstrated an …

Cited by 14 · 8 versions

[PDF] acm.org

Cooperative caching for GPUs

S Dublish, V Nagarajan, N Topham - ACM Transactions on Architecture …, 2016 - dl.acm.org

The rise of general-purpose computing on GPUs has influenced architectural innovation on
them. The introduction of an on-chip cache hierarchy is one such innovation. High L1 miss …

Cited by 35 · 5 versions
