[HTML] A survey on hardware accelerators: Taxonomy, trends, challenges, and perspectives

B Peccerillo, M Mannino, A Mondelli… - Journal of Systems …, 2022 - Elsevier
In recent years, the limits of the multicore approach emerged in the so-called “dark silicon”
issue and diminishing returns of an ever-increasing core count. Hardware manufacturers …

Navigating bottlenecks and trade-offs in genomic data analysis

B Berger, YW Yu - Nature Reviews Genetics, 2023 - nature.com
Genome sequencing and analysis allow researchers to decode the functional information
hidden in DNA sequences as well as to study cell to cell variation within a cell population …

The building blocks of a brain-inspired computer

JD Kendall, S Kumar - Applied Physics Reviews, 2020 - pubs.aip.org
Computers have undergone tremendous improvements in performance over the last 60
years, but those improvements have significantly slowed down over the last decade, owing …

RcppArmadillo: Accelerating R with high-performance C++ linear algebra

D Eddelbuettel, C Sanderson - Computational statistics & data analysis, 2014 - Elsevier
The R statistical environment and language has demonstrated particular strengths for
interactive development of statistical algorithms, as well as data modelling and visualisation …

Patus: A code generation and autotuning framework for parallel iterative stencil computations on modern microarchitectures

M Christen, O Schenk, H Burkhart - 2011 IEEE International …, 2011 - ieeexplore.ieee.org
Stencil calculations comprise an important class of kernels in many scientific computing
applications ranging from simple PDE solvers to constituent kernels in multigrid methods as …

GPU-accelerated preconditioned iterative linear solvers

R Li, Y Saad - The Journal of Supercomputing, 2013 - Springer
This work is an overview of our preliminary experience in developing a high-performance
iterative linear solver accelerated by GPU coprocessors. Our goal is to illustrate the …

Autotuning GEMM kernels for the Fermi GPU

J Kurzak, S Tomov, J Dongarra - IEEE Transactions on Parallel …, 2012 - ieeexplore.ieee.org
In recent years, the use of graphics chips has been recognized as a viable way of
accelerating scientific and engineering applications, even more so since the introduction of …

Graph coloring algorithms for multi-core and massively multithreaded architectures

ÜV Çatalyürek, J Feo, AH Gebremedhin… - Parallel Computing, 2012 - Elsevier
We explore the interplay between architectures and algorithm design in the context of
shared-memory platforms and a specific graph problem of central importance in scientific …

[BOOK] Algorithms Sequential And Parallel: A Unified Approach (Charles River Media Computer Engineering (Hardcover))

R Miller, L Boxer - 2005 - dl.acm.org

Energy efficiency vs. performance of the numerical solution of PDEs: An application study on a low-power ARM-based cluster

D Göddeke, D Komatitsch, M Geveler… - Journal of …, 2013 - Elsevier
Power consumption and energy efficiency are becoming critical aspects in the design and
operation of large scale HPC facilities, and it is unanimously recognised that future exascale …