Erasure coding for distributed storage: An overview

SB Balaji, MN Krishnan, M Vajha, V Ramkumar… - Science China …, 2018 - Springer
In a distributed storage system, code symbols are dispersed across space in nodes or
storage units as opposed to time. In settings such as that of a large data center, an important …

Exploiting combined locality for Wide-Stripe erasure coding in distributed storage

Y Hu, L Cheng, Q Yao, PPC Lee, W Wang… - … USENIX Conference on …, 2021 - usenix.org
Erasure coding is a low-cost redundancy mechanism for distributed storage systems by
storing stripes of data and parity chunks. Wide stripes are recently proposed to suppress the …

Clay codes: Moulding MDS codes to yield an MSR code

M Vajha, V Ramkumar, B Puranik, G Kini… - … USENIX Conference on …, 2018 - usenix.org
With increase in scale, the number of node failures in a data center increases sharply. To
ensure availability of data, failure-tolerance schemes such as Reed-Solomon (RS) or more …

Repair pipelining for erasure-coded storage: Algorithms and evaluation

X Li, Z Yang, J Li, R Li, PPC Lee, Q Huang… - ACM Transactions on …, 2021 - dl.acm.org
We propose repair pipelining, a technique that speeds up the repair performance in general
erasure-coded storage. By carefully scheduling the repair of failed data in small-size units …

ParaRC: Embracing Sub-Packetization for Repair Parallelization in MSR-Coded Storage

X Li, K Cheng, K Tang, PPC Lee, Y Hu, D Feng… - … USENIX Conference on …, 2023 - usenix.org
Minimum-storage regenerating (MSR) codes are provably optimal erasure codes that
minimize the repair bandwidth (i.e., the amount of traffic being transferred during a repair …

Rack-aware regenerating codes for data centers

H Hou, PPC Lee, KW Shum… - IEEE Transactions on …, 2019 - ieeexplore.ieee.org
Erasure coding is widely used for massive storage in data centers to achieve high fault
tolerance and low storage redundancy. Since the cross-rack communication cost is often …

Optimal repair layering for erasure-coded data centers: From theory to practice

Y Hu, X Li, M Zhang, PPC Lee, X Zhang… - ACM Transactions on …, 2017 - dl.acm.org
Repair performance in hierarchical data centers is often bottlenecked by cross-rack network
transfer. Recent theoretical results show that the cross-rack repair traffic can be minimized …

Codes for distributed storage

V Ramkumar, M Vajha, SB Balaji… - … of Coding Theory, 2021 - api.taylorfrancis.com
The traditional means of ensuring reliability in data storage is to store multiple copies of the
same file in different storage units. Such a replication strategy is clearly inefficient in terms of …

Boosting full-node repair in erasure-coded storage

S Lin, G Gong, Z Shen, PPC Lee, J Shu - 2021 USENIX Annual …, 2021 - usenix.org
As a common choice for fault tolerance in today's storage systems, erasure coding is still
hampered by the induced substantial traffic in repair. A variety of erasure codes and repair …

Parallelized in-network aggregation for failure repair in erasure-coded storage systems

J Xia, L Luo, B Sun, G Cheng… - IEEE/ACM Transactions …, 2024 - ieeexplore.ieee.org
To repair a failed block in the erasure-coded storage system, multiple related blocks have to
be retrieved from other storage nodes across the network. Such a process can lead to …