Performance and power analysis of high-density multi-GPGPU architectures: A preliminary case study

Y Gao, S Iqbal, P Zhang, M Qiu - 2015 IEEE 17th International …, 2015 - ieeexplore.ieee.org
A system architecture with high-density general purpose graphic processing unit (GPGPU) is
emerging as a promising solution that can offer high compute performance and performance …

Optimal circulant graphs as low-latency network topologies

X Huang, AF Ramos, Y Deng - The Journal of Supercomputing, 2022 - Springer
Communication latency has become one of the determining factors for the performance of
parallel clusters. To design low-latency network topologies for high-performance computing …

A multiple time stepping algorithm for efficient multiscale modeling of platelets flowing in blood plasma

P Zhang, N Zhang, Y Deng, D Bluestein - Journal of computational physics, 2015 - Elsevier
We developed a multiple time-stepping (MTS) algorithm for multiscale modeling of the
dynamics of platelets flowing in viscous blood plasma. This MTS algorithm improves …

Matrix multiplication on high-density multi-GPU architectures: theoretical and experimental investigations

P Zhang, Y Gao - … Computing: 30th International Conference, ISC High …, 2015 - Springer
Matrix multiplication (MM) is one of the core problems in the high performance computing
domain and its efficiency impacts performances of almost all matrix problems. The high …

A phenomenological particle‐based platelet model for simulating filopodia formation during early activation

S Pothapragada, P Zhang, J Sheriff… - … journal for numerical …, 2015 - Wiley Online Library
We developed a phenomenological three‐dimensional platelet model to characterize the
filopodia formation observed during early stage platelet activation. Departing from …

Optimal low-latency network topologies for cluster performance enhancement

Y Deng, M Guo, AF Ramos, X Huang, Z Xu… - The Journal of …, 2020 - Springer
We propose that clusters interconnected with network topologies having minimal mean path
length will increase their processing speeds. We approach our heuristic by constructing …

QuantCloud: big data infrastructure for quantitative finance on the cloud

P Zhang, K Yu, JY Jessica… - IEEE Transactions on Big …, 2017 - ieeexplore.ieee.org
In this paper, we present the QuantCloud infrastructure, designed for performing big data
analytics in modern quantitative finance. Through analyzing market observations …

Symmetric and folded tori connected torus network

MMH Rahman, Y Inoguchi, F Al Faisal… - Journal of …, 2011 - researchgate.net
Hierarchical interconnection networks provide high performance at low cost by exploring the
locality that exists in the communication patterns of massively parallel computers. A …

Reducing static energy in supercomputer interconnection networks using topology-aware partitioning

J Chen, Y Tang, Y Dong, J Xue… - IEEE Transactions on …, 2015 - ieeexplore.ieee.org
The key to reducing static energy in supercomputers is switching off their unused
components. Routers are the major components of a supercomputer. Whether routers can …

Design and architecture of Dell acceleration appliances for database (DAAD): A practical approach with high availability guaranteed

K Yu, Y Gao, P Zhang, M Qiu - 2015 IEEE 17th international …, 2015 - ieeexplore.ieee.org
As IT organizations are pursuing database High Availability (HA) solutions to ensure and
protect critical commercial data, the challenge is to leverage the three-fold key dimensions …