Google Scholar

M Diener, EHM Cruz, MAZ Alves, POA Navaux… - ACM Computing …, 2016 - dl.acm.org

Shared memory architectures have recently experienced a large increase in thread-level
parallelism, leading to complex memory hierarchies with multiple cache memory levels and …

Cited by 54 Related articles All 6 versions

[PDF] rtas.org

Multi-objective co-optimization of FlexRay-based distributed control systems

D Roy, L Zhang, W Chang, D Goswami… - 2016 IEEE Real …, 2016 - ieeexplore.ieee.org

Recently, research on control and architecture co-design has been drawing increasingly
more attention. This is because these techniques integrate the design of the controllers and …

Cited by 60 Related articles All 9 versions

[PDF] ieee.org

DeLoc: a locality and memory-congestion-aware task mapping method for modern NUMA systems

M Agung, MA Amrizal, R Egawa, H Takizawa - IEEE Access, 2020 - ieeexplore.ieee.org

The mapping of tasks to processor cores, called task mapping, is crucial to achieving
scalable performance on multicore processors. On modern NUMA (non-uniform memory …

Cited by 9 Related articles All 6 versions

[PDF] hal.science

Process affinity, metrics and impact on performance: An empirical study

C Bordage, E Jeannot - … Symposium on Cluster, Cloud and Grid …, 2018 - ieeexplore.ieee.org

Process placement, also called topology mapping, is a well-known strategy to improve
parallel program execution by reducing the communication cost between processes. It …

Cited by 11 Related articles All 7 versions

[PDF] mdpi.com

A Low-Level Virtual Machine Just-In-Time Prototype for Running an Energy-Saving Hardware-Aware Mapping Algorithm on C/C++ Applications That Use Pthreads

I Știrb, GR Gillich - Energies, 2023 - mdpi.com

Low-Level Virtual Machine (LLVM) compiler infrastructure is a useful tool for building just-in-
time (JIT) compilers, besides its reliable front end represented by a clang compiler and its …

Cited by 1 Related articles All 4 versions

Using NAS Parallel Benchmarks to evaluate HPC performance in clouds

TK Okada, A Goldman… - 2016 IEEE 15th …, 2016 - ieeexplore.ieee.org

Cloud computing is a reality nowadays, however there are few studies trying to understand
what happens in the actual cloud infrastructures for HPC applications. The focus of this study …

Cited by 11 Related articles All 3 versions

[PDF] ssrn.com

Optimizing performance and energy across problem sizes through a search space exploration and machine learning

L Scravaglieri, M Popov, LL Pilla, A Guermouche… - Journal of Parallel and …, 2023 - Elsevier

HPC systems expose configuration options to assist optimization. Configurations such as
parallelism, thread and data mapping, or prefetching have been explored but with a limited …

Cited by 3 Related articles All 6 versions

NUMA-BTDM: A thread mapping algorithm for balanced data locality on NUMA systems

I Ştirb - 2016 17th International Conference on Parallel and …, 2016 - ieeexplore.ieee.org

Optimizing for Non-Uniform Memory Access (NUMA) systems could be considered
inappropriate because hardware architecture aware optimizations are not portable. On the …

Cited by 8 Related articles

[PDF] researchgate.net

NUMA-BTLP: A static algorithm for thread classification

I Ştirb - 2018 5th International Conference on Control, Decision …, 2018 - ieeexplore.ieee.org

Despite NUMA aware optimizations are often considered not portable, this paper states that
extending a compiler, supporting compilation of parallel APIs, with NUMA-aware …

Cited by 7 Related articles All 2 versions

[PDF] upc.edu

Predicting the soft error vulnerability of parallel applications using machine learning

I Öz, S Arslan - International Journal of Parallel Programming, 2021 - Springer

With the widespread use of the multicore systems having smaller transistor sizes, soft errors
become an important issue for parallel program execution. Fault injection is a prevalent …

Cited by 5 Related articles All 7 versions

Locality and balance for communication-aware thread mapping in multicore systems

Affinity-based thread and data mapping in shared memory systems
