Google Scholar

Cpp-Taskflow: Fast task-based parallel programming using modern C++

TW Huang, CX Lin, G Guo… - 2019 IEEE International …, 2019 - ieeexplore.ieee.org

In this paper we introduce Cpp-Taskflow, a new C++ tasking library to help developers
quickly write parallel programs using task dependency graphs. Cpp-Taskflow leverages the …

Cited by 96 - All 7 versions

[PDF] hal.science

Achieving high performance on supercomputers with a sequential task-based programming model

E Agullo, O Aumage, M Faverge… - … on Parallel and …, 2017 - ieeexplore.ieee.org

The emergence of accelerators as standard computing resources on supercomputers and
the subsequent architectural complexity increase revived the need for high-level parallel …

Cited by 126 - All 12 versions

[PDF] ieee.org

Cpp-taskflow: A general-purpose parallel task programming system at scale

TW Huang, Y Lin, CX Lin, G Guo… - IEEE Transactions on …, 2020 - ieeexplore.ieee.org

This article introduces Cpp-Taskflow, a high-performance parallel task programming system,
to streamline the building of large and complex parallel applications. Cpp-Taskflow …

Cited by 30 - All 5 versions

[PDF] arxiv.org

Taking advantage of hybrid systems for sparse direct solvers via task-based runtimes

X Lacoste, M Faverge, G Bosilca… - … Parallel & Distributed …, 2014 - ieeexplore.ieee.org

The ongoing hardware evolution exhibits an escalation in the number, as well as in the
heterogeneity, of computing resources. The pressure to maintain reasonable levels of …

Cited by 74 - All 22 versions

[PDF] susu.ru

Parallel programming models for dense linear algebra on heterogeneous systems

J Dongarra, M Abalenkovs, A Abdelfattah… - Supercomputing …, 2015 - superfri.susu.ru

We present a review of the current best practices in parallel programming models for dense
linear algebra (DLA) on heterogeneous architectures. We consider multicore CPUs, stand …

Cited by 62 - All 16 versions

[PDF] netlib.org

Porting the PLASMA numerical library to the OpenMP standard

A YarKhan, J Kurzak, P Luszczek… - International Journal of …, 2017 - Springer

PLASMA is a numerical library intended as a successor to LAPACK for solving problems in
dense linear algebra on multicore processors. PLASMA relies on the QUARK scheduler for …

Cited by 51 - All 10 versions

[PDF] osti.gov

Improving performance of GMRES by reducing communication and pipelining global collectives

I Yamazaki, M Hoemmen, P Luszczek… - 2017 IEEE …, 2017 - ieeexplore.ieee.org

We compare the performance of pipelined and s-step GMRES, respectively referred to as l-GMRES
and s-GMRES, on distributed multicore CPUs. Compared to standard GMRES, s …

Cited by 36 - All 7 versions

[PDF] hal.science

[PDF] On runtime systems for task-based programming on heterogeneous platforms

S Thibault - 2018 - inria.hal.science

Simulation has become pervasive in science. Real experimentation remains an essential
step in scientific research, but simulation replaced a wide range of costly and lengthy or …

Cited by 30 - All 4 versions

[PDF] wiley.com

HPC Programming on Intel Many‐Integrated‐Core Hardware with MAGMA Port to Xeon Phi

J Dongarra, M Gates, A Haidar, Y Jia… - Scientific …, 2015 - Wiley Online Library

This paper presents the design and implementation of several fundamental dense linear
algebra (DLA) algorithms for multicore with Intel Xeon Phi coprocessors. In particular, we …

Cited by 43 - All 16 versions

[PDF] utk.edu

Unified development for mixed multi-gpu and multi-coprocessor environments using a lightweight runtime environment

A Haidar, C Cao, A Yarkhan, P Luszczek… - 2014 IEEE 28th …, 2014 - ieeexplore.ieee.org

Many of the heterogeneous resources available to modern computers are designed for
different workloads. In order to efficiently use GPU resources, the workload must have a …

Cited by 40 - All 10 versions

Dynamic task execution on shared and distributed memory architectures

Cpp-Taskflow: Fast task-based parallel programming using modern C++
