SiP-ML: high-bandwidth optical network interconnects for machine learning training

M Khani, M Ghobadi, M Alizadeh, Z Zhu… - Proceedings of the …, 2021 - dl.acm.org
This paper proposes optical network interconnects as a key enabler for building high-
bandwidth ML training clusters with strong scaling properties. Our design, called SiP-ML …

Petabit-scale silicon photonic interconnects with integrated Kerr frequency combs

A Rizzo, S Daudlin, A Novick, A James… - IEEE Journal of …, 2022 - ieeexplore.ieee.org
Silicon photonics holds significant promise in revolutionizing optical interconnects in data
centers and high performance computers to enable scaling into the Pb/s package escape …

Analog coherent detection for energy efficient intra-data center links at 200 Gbps per wavelength

T Hirokawa, S Pinna, N Hosseinzadeh… - Journal of Lightwave …, 2020 - ieeexplore.ieee.org
As datacenters continue to scale in size, energy efficiency for short reach (< 2 km) links is a
major factor for networks that may connect hundreds of thousands of servers. We …

A case for intra-rack resource disaggregation in HPC

G Michelogiannakis, B Klenk, B Cook, MY Teh… - ACM Transactions on …, 2022 - dl.acm.org
The expected halt of traditional technology scaling is motivating increased heterogeneity in
high-performance computing (HPC) systems with the emergence of numerous specialized …

Noise in the clouds: Influence of network performance variability on application scalability

D De Sensi, T De Matteis, K Taranov… - Proceedings of the …, 2022 - dl.acm.org
Cloud computing represents an appealing opportunity for cost-effective deployment of HPC
workloads on the best-fitting hardware. However, although cloud and on-premise HPC …

Flexspander: augmenting expander networks in high-performance systems with optical bandwidth steering

MY Teh, Z Wu, K Bergman - Journal of Optical Communications and …, 2020 - opg.optica.org
Communication efficiency is one of the deciding factors in determining many of today's high-
performance computing (HPC) applications. Traditionally, HPC systems have been on static …

PINE: photonic integrated networked energy efficient datacenters (ENLITENED program)

M Glick, NC Abrams, Q Cheng, MY Teh… - Journal of Optical …, 2020 - opg.optica.org
We review the motivation, goals, and achievements of the Photonic Integrated Networked
Energy efficient datacenter (PINE) project, which is part of the Advanced Research Projects …

Performance trade-offs in reconfigurable networks for HPC

MY Teh, Z Wu, M Glick, S Rumley… - Journal of Optical …, 2022 - opg.optica.org
Designing efficient interconnects to support high-bandwidth and low-latency communication
is critical toward realizing high performance computing (HPC) and data center (DC) systems …

Architecture and performance studies of 3D-Hyper-FleX-LION for reconfigurable all-to-all HPC networks

G Liu, R Proietti, M Fariborz, P Fotouhi… - … Conference for High …, 2020 - ieeexplore.ieee.org
While the Fat-Tree network topology represents the dominant state-of-art solution for large-
scale HPC networks, its scalability in terms of power, latency, complexity, and cost is …

Characterization and identification of HPC applications at leadership computing facility

Z Liu, R Lewis, R Kettimuthu, K Harms… - Proceedings of the 34th …, 2020 - dl.acm.org
High Performance Computing (HPC) is an important method for scientific discovery via large-
scale simulation, data analysis, or artificial intelligence. Leadership-class supercomputers …