A survey on hardware accelerators: Taxonomy, trends, challenges, and perspectives

B Peccerillo, M Mannino, A Mondelli… - Journal of Systems …, 2022 - Elsevier
In recent years, the limits of the multicore approach emerged in the so-called “dark silicon”
issue and diminishing returns of an ever-increasing core count. Hardware manufacturers …

MIMD programs execution support on SIMD machines: a holistic survey

D Mustafa, R Alkhasawneh, F Obeidat… - IEEE Access, 2024 - ieeexplore.ieee.org
The Single Instruction Multiple Data (SIMD) architecture, supported by various high-
performance computing platforms, efficiently utilizes data-level parallelism. The SIMD model …

VPIC 2.0: Next generation particle-in-cell simulations

R Bird, N Tan, SV Luedtke, SL Harrell… - … on Parallel and …, 2021 - ieeexplore.ieee.org
VPIC is a general purpose particle-in-cell simulation code for modeling plasma phenomena
such as magnetic reconnection, fusion, solar weather, and laser-plasma interaction in three …

Efficiently running SpMV on long vector architectures

C Gómez, F Mantovani, E Focht, M Casas - Proceedings of the 26th ACM …, 2021 - dl.acm.org
Sparse Matrix-Vector multiplication (SpMV) is an essential kernel for parallel numerical
applications. SpMV displays sparse and irregular data accesses, which complicate its …

Performance evaluation of a next-generation SX-Aurora TSUBASA vector supercomputer

K Takahashi, S Fujimoto, S Nagase, Y Isobe… - … Conference on High …, 2023 - Springer
Data movement is a key bottleneck in terms of both performance and energy efficiency in
modern HPC systems. The NEC SX-series supercomputers have a long history of …

Efficient execution of SpGEMM on long vector architectures

V Le Fèvre, M Casas - … of the 32nd International Symposium on High …, 2023 - dl.acm.org
The Sparse GEneral Matrix-Matrix multiplication (SpGEMM) C = A × B is a fundamental
routine extensively used in domains like machine learning or graph analytics. Despite its …

An external definition of the one-hot constraint and fast QUBO generation for high-performance combinatorial clustering

M Kumagai, K Komatsu, F Takano, T Araki… - International Journal of …, 2021 - jstage.jst.go.jp
Recently, a clustering method using a combinatorial optimization problem, called
combinatorial clustering, has been drawing attention due to the rapid spreads of quantum …

Performance and power analysis of a vector computing system

K Komatsu, A Onodera, E Focht, S Fujimoto… - Supercomputing …, 2021 - superfri.org
The performance of recent computing systems has drastically improved due to the increase
in the number of cores. However, this approach is reaching the limitation due to the power …

Agents of autonomy: A systematic study of robotics on modern hardware

M Bakhshalipour, PB Gibbons - … of the ACM on Measurement and …, 2023 - dl.acm.org
As robots increasingly permeate modern society, it is crucial for the system and hardware
research community to bridge its long-standing gap with robotics. This divide has persisted …

Exploiting the potentials of the second generation SX-Aurora TSUBASA

R Egawa, S Fujimoto, T Yamashita… - 2020 IEEE/ACM …, 2020 - ieeexplore.ieee.org
NEC SX-series vector supercomputers have provided outstanding memory bandwidths to
meet the strong demands for efficient execution of memory-intensive scientific applications …