- Academic Search

Data locality in high performance computing, big data, and converged systems: An analysis of the cutting edge and a future system architecture

S Usman, R Mehmood, I Katib, A Albeshri - Electronics, 2022 - mdpi.com

Big data has revolutionized science and technology leading to the transformation of our
societies. High-performance computing (HPC) provides the necessary computational power …

Cited by 21

Improving the efficiency of graph algorithm executions on high‐performance computing

MK Moori, HMG de A. Rocha… - Concurrency and …, 2023 - Wiley Online Library

The growing need for extracting information from large graphs has been pushing the
development of parallel graph algorithms. However, the highly irregular structure of the real …

Cited by 8

[PDF] hal.science

Exposing the locality of heterogeneous memory architectures to HPC applications

B Goglin - Proceedings of the Second International Symposium …, 2016 - dl.acm.org

High-performance computing requires a deep knowledge of the hardware platform to fully
exploit its computing power. The performance of data transfer between cores and memory is …

Cited by 32

Smart resource allocation of concurrent execution of parallel applications

VS da Silva, AGD Nogueira, EC de Lima… - Concurrency and …, 2023 - Wiley Online Library

Thread‐level parallelism (TLP) has been widely exploited to optimize computational
resource usage in high‐performance systems. However, as many applications do not scale …

Cited by 6

[PDF] hal.science

Towards the Structural Modeling of the Topology of next-generation heterogeneous cluster Nodes with hwloc

B Goglin - 2016 - inria.hal.science

Parallel computing platforms are increasingly complex, with multiple cores, shared caches,
and NUMA memory interconnects, as well as asymmetric I/O access. Upcoming …

Cited by 10

[PDF] udc.es

UPCBLAS: a library for parallel matrix computations in Unified Parallel C

J González‐Domínguez, MJ Martín… - Concurrency and …, 2012 - Wiley Online Library

The popularity of Partitioned Global Address Space (PGAS) languages has
increased during the last years thanks to their high programmability and performance …

Cited by 14

[PDF] hal.science

On the overhead of topology discovery for locality-aware scheduling in HPC

B Goglin - 2017 25th Euromicro International Conference on …, 2017 - ieeexplore.ieee.org

The increasing complexity of parallel computing platforms requires a deep knowledge of the
hardware and of the application needs. Locality is a key criterion for performance optimization. It …

Cited by 7

Analyzing the energy efficiency of the memory subsystem in multicore processors

S Catalan, JG Dominguez, R Mayo… - 2014 IEEE International …, 2014 - ieeexplore.ieee.org

In this paper we analyze the energy overhead incurred when operating with data stored in
different levels of the memory subsystem (cache levels and DDR chips) of current multicore …

Cited by 5

[PDF] hal.science

Solving dense linear systems on accelerated multicore architectures

A Rémy - 2015 - theses.hal.science

In this PhD thesis, we study algorithms and implementations to accelerate the solution of
dense linear systems by using hybrid architectures with multicore processors and …

Cited by 4

[PDF] hal.science

Locality optimization on a NUMA architecture for hybrid LU factorization

A Rémy, M Baboulin, M Sosonkina… - … and Engineering (CSE), 2014 - ebooks.iospress.nl

We study the impact of non-uniform memory accesses (NUMA) on the solution of dense
general linear systems using an LU factorization algorithm. In particular we illustrate how an …

Cited by 4

Automatic mapping of parallel applications on multicore architectures using the Servet benchmark...