XcalableACC: Extension of XcalableMP PGAS language using OpenACC for accelerator clusters

M Nakao, H Murai, T Shimosaka… - 2014 first workshop …, 2014 - ieeexplore.ieee.org
The present paper introduces the XcalableACC (XACC) programming model, which is a
hybrid model of the XcalableMP (XMP) Partitioned Global Address Space (PGAS) language …

High-performance spectral element methods on field-programmable gate arrays: implementation, evaluation, and future projection

M Karp, A Podobas, N Jansson, T Kenter… - 2021 IEEE …, 2021 - ieeexplore.ieee.org
Improvements in computer systems have historically relied on two well-known observations:
Moore's law and Dennard's scaling. Today, both these observations are ending, forcing …

A cost/power efficient storage system with directly connected FPGA and SATA disks

R Niwase, H Harasawa, Y Yamaguchi… - 2023 IEEE 16th …, 2023 - ieeexplore.ieee.org
Providing large storage on Multi-Access Edge (MEC) devices has various advantages: A
large amount of bare data that cannot transfer to the cloud without anonymization can be …

Off-loading LET generation to PEACH2: A switching hub for high performance GPU clusters

C Tsuruta, Y Miki, T Kuhara, H Amano… - ACM SIGARCH …, 2016 - dl.acm.org
A hardware local essential tree (LET) generator used in an N-body simulation is
implemented on the FPGA of PEACH2 (PCI Express Adaptive Communication Hub ver2), a …

Reduction calculator in an FPGA based switching Hub for high performance clusters

T Kuhara, C Tsuruta, T Hanawa… - 2015 25th International …, 2015 - ieeexplore.ieee.org
Unused logic in the field-programmable gate array (FPGA) for the switching hub is one
potential resource to accelerate the computation of data exchanged through the hub …

Thorough analysis of PCIe Gen3 communication

H Nakamura, H Takayama… - 2017 International …, 2017 - ieeexplore.ieee.org
This article tries a thorough analysis from the physical layer to the transaction layer on PCIe
Gen3 communication by using FPGAs. First, this article shows the performance variation of …

Evaluation of XcalableACC with tightly coupled accelerators/InfiniBand hybrid communication on accelerated cluster

M Nakao, T Odajima, H Murai… - … Journal of High …, 2019 - journals.sagepub.com
Accelerated clusters, which are cluster systems equipped with accelerators, are one of the
most common systems in parallel computing. In order to exploit the performance of such …

Implementation of CG method on GPU cluster with proprietary interconnect TCA for GPU direct communication

K Matsumoto, T Hanawa, Y Kodama… - 2015 IEEE …, 2015 - ieeexplore.ieee.org
We have been developing a proprietary interconnect technology called Tightly Coupled
Accelerators (TCA) architecture to improve communication latency and bandwidth between …

A preliminarily evaluation of PEACH3: a switching hub for tightly coupled accelerators

T Kuhara, T Kaneda, T Hanawa… - … on computing and …, 2014 - ieeexplore.ieee.org
Tightly coupled accelerators (TCA) architecture consists of heterogeneous nodes connected
to a low-latency high-bandwidth network which allows accelerators to communicate directly …

Performance evaluation of sparse matrix numerical computation using FPGAs

大島聡史, 塙敏博, 片桐孝洋… - Research Report High Performance …, 2016 - ipsj.ixsq.nii.ac.jp
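Abstract: In recent years, FPGAs (Field Programmable Gate Arrays) have been attracting attention as new
high-performance computing hardware. By using a circuit configuration optimized for the target processing, FPGAs can achieve high …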