https://mazzo.li/posts/reed-solomon.html

-

2022-11-06 The essence of Reed-Solomon coding

Let's say we want to store some data on multiple drives, so that we
can recover from drive failures.

One obvious (and valid!) first attempt is to just store everything
multiple times - usually called "mirroring". The most common form of
mirroring is to store things twice: if one drive fails, the data is
still in the other.^1

Reed-Solmon coding gives us much more flexibility, allowing us to
store our data over n = k + t drives, so that any t drives can fail
while still not losing data.^2

For instance, with \mathrm{RS}(10, 4) we can store data over 14
drives, and any 4 drives can fail at once, without incurring data
loss. The data is split over 10 equal blocks, and then 4 additional
blocks ("parity" blocks) are generated in a such a way that any 4
blocks can be recovered from the other 10.^3^4

The main idea behind Reed-Solomon is easy to visualize. We'll focus
on the task of durably storing k numbers y_1, ..., y_k. We'll plot
them as (1, y_1), ..., (k, y_k). For instance, with k = 4, we might
have:

 1. This is often known as RAID1.-[?]

 2. This post presents a specific way to perform Reed-Solomon coding,
    using Lagrange interpolation. See the Wikipedia article for more
    information.-[?]

 3. RAID4 and RAID6 fit in this framework as \mathrm{RS}(3,1) and \
    mathrm{RS}(4,2) respectively.-[?]

 4. The "parity" terminology comes from other error detecting or
    correcting codes which use parity bits.-[?]

[rs-plot-1]

For any distinct k points, there's a unique polynomial of degree d <
k passing through them.^5 This fact should not be too surprising, and
is apparent when k = 2: given two distinct points, there's only one
line passing through them.

Here's the unique third-degree polynomial passing through our four
points:

[rs-plot-2]

 5. A polynomial of degree d is a something of the form

    a_0 + a_1 x + a_2 x^2 + ... + a_d x^d-[?]

Armed with this fact, we can pick an additional t points on the same
polynomial. Again, with k = 4 and t = 2, we would have:

[rs-plot-3]

Sampled points are in gold. Since the interpolating polynomial is
unique given any k points it passes through, we can drop any t
points, and we'll still be able to recover the polynomial. Once we've
done that, we can just resample it to recover the numbers we're
storing.

To recap, our procedure to durably store k numbers is as follows:

  * Compute the unique polynomial of degree d < k by placing the
    numbers at some predefined interval on the XY plane;
  * Sample the polynomial t times beyond the original points;
  * Store the k original numbers alongside the t "parity" numbers;
  * If some numbers are lost we can recompute them by resampling the
    polynomial.

At this point, you already understand a key idea powering
Reed-Solomon coding.

The other key idea allows us to store bits, rather than numbers. I
won't properly explain it here, but the gist of it is to use finite
fields of order 2^{\mathrm{bits}} as our number type.^6

Such finite fields are numeric types working differently from the
usual integers modulo 2^{\mathrm{bits}} that we're used to program
with, but still easy to implement in hardware, and importantly
numbers for which the considerations in this blog post about
polynomials hold.^7

Once we have such a numeric type, all we need to do to durably store
some blob of data is to split it in a series of 2^{\mathrm{bits}}
numbers (each \mathrm{bits} wide), group them in sets of k, and then
store them durably as described in this post, tuning t based on how
redundant we want to be.

The final trick worth mentioning is that this kind of Reed-Solomon
encoding can be implemented efficiently given that we have fixed the 
x coordinates, no matter what numbers we want to store. Or in other
words, the definition for \ell_j which we use to generate the unique
polynomial only depends on the x coordinates, which allows us to do
the heavy lifting once for any k numbers we want to store.

 6. Head over to Peter Cawley's blog for details on how the finite
    field machinery works on modern CPUs.-[?]

 7. As the name suggests, finite fields are fields, which is handy
    since generating polynomials passing through a set of points
    involves divisions.

    Integers modulo 2^\mathrm{bits}, which is what we're used to
    program with, are not closed under division, which jams the maths
    required to perform the operations we described.-[?]

---------------------------------------------------------------------
f@mazzo.li * twitter * source