Google Scholar

Analysis of dynamically scheduled tile algorithms for dense linear algebra on multicore architect...

PLASMA: Parallel linear algebra software for multicore using OpenMP

J Dongarra, M Gates, A Haidar, J Kurzak… - ACM Transactions on …, 2019 - dl.acm.org

The recent version of the Parallel Linear Algebra Software for Multicore Architectures
(PLASMA) library is based on tasks with dependencies from the OpenMP standard. The …

Cited by 73 · All 4 versions

[PDF] netlib.org

Porting the PLASMA numerical library to the OpenMP standard

A YarKhan, J Kurzak, P Luszczek… - International Journal of …, 2017 - Springer

PLASMA is a numerical library intended as a successor to LAPACK for solving problems in
dense linear algebra on multicore processors. PLASMA relies on the QUARK scheduler for …

Cited by 51 · All 10 versions

[PDF] tennessee.edu

Dynamic task execution on shared and distributed memory architectures

A YarKhan - 2012 - trace.tennessee.edu

Multicore architectures with high core counts have come to dominate the world of high
performance computing, from shared memory machines to the largest distributed memory …

Cited by 63 · All 5 versions

[PDF] netlib.org

Achieving numerical accuracy and high performance using recursive tile LU factorization with partial pivoting

J Dongarra, M Faverge, H Ltaief… - Concurrency and …, 2014 - Wiley Online Library

The LU factorization is an important numerical algorithm for solving systems of linear
equations in science and engineering and is a characteristic of many dense linear algebra …

Cited by 57 · All 12 versions

[PDF] psu.edu

Investigating applications portability with the Uintah DAG-based runtime system on petascale supercomputers

Q Meng, A Humphrey, J Schmidt… - Proceedings of the …, 2013 - dl.acm.org

Present trends in high performance computing present formidable challenges for
applications code using multicore nodes possibly with accelerators and/or co-processors …

Cited by 55 · All 15 versions

[PDF] netlib.org

An improved parallel singular value algorithm and its implementation for multicore hardware

A Haidar, J Kurzak, P Luszczek - … of the International Conference on High …, 2013 - dl.acm.org

The enormous gap between the high-performance capabilities of today's CPUs and off-chip
communication poses extreme challenges to the development of numerical software that is …

Cited by 54 · All 10 versions

[PDF] ssslab.cn

Efficient block algorithms for parallel sparse triangular solve

Z Lu, Y Niu, W Liu - Proceedings of the 49th International Conference on …, 2020 - dl.acm.org

The sparse triangular solve (SpTRSV) kernel is an important building block for a number of
linear algebra routines such as sparse direct and iterative solvers. The major challenge of …

Cited by 24 · All 3 versions

LU factorization with partial pivoting for a multicore system with accelerators

J Kurzak, P Luszczek, M Faverge… - IEEE Transactions on …, 2012 - ieeexplore.ieee.org

LU factorization with partial pivoting is a canonical numerical procedure and the main
component of the high performance LINPACK benchmark. This paper presents an …

Cited by 44 · All 7 versions

[PDF] escholarship.org

[PDF] Dense linear algebra on distributed heterogeneous hardware with a symbolic DAG approach

G Bosilca - 2012 - escholarship.org

While the first two involve fundamental physical limitations that current technology trends are
unlikely to overcome in the near term, the third is an obvious consequence of the first two …

Cited by 38 · All 18 versions

[PDF] 131.254.254.45

High performance matrix inversion based on LU factorization for multicore architectures

J Dongarra, M Faverge, H Ltaief… - Proceedings of the 2011 …, 2011 - dl.acm.org

The goal of this paper is to present an efficient implementation of an explicit matrix inversion
of general square matrices on multicore computer architecture. The inversion procedure is …

Cited by 35 · All 19 versions
