cuSZp2: A GPU Lossy Compressor with Extreme Throughput and Optimized Compression Ratio

Y Huang, S Di, G Li, F Cappello - … : International Conference for …, 2024 - ieeexplore.ieee.org
Existing GPU lossy compressors suffer from expensive data movement overheads,
inefficient memory access patterns, and high synchronization latency, resulting in limited …

cuSZ-i: High-Ratio Scientific Lossy Compression on GPUs with Optimized Multi-Level Interpolation

J Liu, J Tian, S Wu, S Di, B Zhang… - … Conference for High …, 2024 - ieeexplore.ieee.org
Error-bounded lossy compression is a critical technique for significantly reducing scientific
data volumes. Compared to CPU-based compressors, GPU-based compressors exhibit …

gZCCL: Compression-accelerated collective communication framework for GPU clusters

J Huang, S Di, X Yu, Y Zhai, J Liu, Y Huang… - Proceedings of the 38th …, 2024 - dl.acm.org
GPU-aware collective communication has become a major bottleneck for modern computing
platforms as GPU computing power rapidly rises. A traditional approach is to directly …

CereSZ: Enabling and scaling error-bounded lossy compression on Cerebras CS-2

S Song, Y Huang, P Jiang, X Yu, W Zheng… - Proceedings of the 33rd …, 2024 - dl.acm.org
Today's scientific applications running on supercomputers produce large volumes of data,
leading to critical data storage and communication challenges. To tackle the challenges …

HoSZp: An efficient homomorphic error-bounded lossy compressor for scientific data

T Agarwal, S Di, J Huang, Y Huang… - arXiv preprint arXiv …, 2024 - arxiv.org
Error-bounded lossy compression has been a critical technique to significantly reduce the
sheer amounts of simulation datasets for high-performance computing (HPC) scientific …

Overcoming memory constraints in quantum circuit simulation with a high-fidelity compression framework

B Zhang, B Fang, F Ye, Y Gu, N Tallent, G Tan… - arXiv preprint arXiv …, 2024 - arxiv.org
Full-state quantum circuit simulation requires exponentially increased memory size to store
the state vector as the number of qubits scales, presenting significant limitations in classical …

hZCCL: Accelerating Collective Communication with Co-Designed Homomorphic Compression

J Huang, S Di, X Yu, Y Zhai, J Liu, Z Jian… - … Conference for High …, 2024 - ieeexplore.ieee.org
As network bandwidth struggles to keep up with rapidly growing computing capabilities, the
efficiency of collective communication has become a critical challenge for exa-scale …

POSTER: Optimizing Collective Communications with Error-bounded Lossy Compression for GPU Clusters

J Huang, S Di, X Yu, Y Zhai, J Liu, Y Huang… - Proceedings of the 29th …, 2024 - dl.acm.org
GPU-aware collective communication has become a major bottleneck for modern computing
platforms as GPU computing power rapidly rises. To address this issue, traditional …

A Survey on Error-Bounded Lossy Compression for Scientific Datasets

S Di, J Liu, K Zhao, X Liang, R Underwood… - arXiv preprint arXiv …, 2024 - arxiv.org
Error-bounded lossy compression has been effective in significantly reducing the data
storage/transfer burden while preserving the reconstructed data fidelity very well. Many error …

A Portable, Fast, DCT-based Compressor for AI Accelerators

M Shah, X Yu, S Di, M Becchi, F Cappello - Proceedings of the 33rd …, 2024 - dl.acm.org
Lossy compression can be an effective tool in AI training and inference to reduce memory
requirements, storage footprint, and in some cases, execution time. With the rise of novel …