Storage – Erasure Coding and RAID – A Few Good Links

Erasure coding has been around for a little while now, and if you’ve ever sat through a presentation from a cloud storage provider talking about resiliency of data at scale, you may have heard it mentioned. It occurred to me that I’ve just assumed that people know what it is, and that’s not fair. I was going to do a post explaining what it is, but figured a quick post linking to some articles I found useful would be more helpful. Because what’s the point of the internet if I can’t be lazy and link to things on it?

Here are some useful research papers to start with:

The “press” also has some useful articles on the topic. I recommend you have a look at these two:

Some of my preferred analysts have written a bit on the topic:

Josh has also done a great deep-dive on the Nutanix version of erasure coding (EC-X) that you can see here.

My favourite post, though, is this one: Dummies Guide to Erasure Coding.



  1. A technology that doesn’t seem to be well known in my circles is the CRUSH algorithm. It was implemented by LSI/NetApp and ended up on arrays like the IBM DS35XX, MD32XX, MD36XX, and SUN 2500 M2 as Dynamic Disk Pools. I’m not sure who else uses it, but these are the ones I’ve had real-world experience with.

    It’s actually a pretty cool piece of technology: rebuild times on large disks are much better than with traditional RAID 6.

