
Understanding RS RAID

This article explains some often misunderstood concepts about RAID that need to be correctly understood in order to make good judgements about which type of RAID scheme is best.

Some companies now advertise that they use Reed-Solomon (RS) codes to provide fault-tolerance in their storage systems. However, there are fundamental problems with the way they implement RS codes, as this web page explains.

Problems When There Are No Errors or Component Failures

For RS codes (and every other linear block code), the redundant check values are mathematical functions of the message values. The words “message” and “data” as used on this web page are equivalent. The encoding function multiplies the variable message values by an encoding matrix of constant values as shown below to compute the redundant check values.

Each of the redundant check values c0, c1, c2, and c3 is a sum of products: the message values are multiplied by constants from the encoding matrix, and the products are summed to form the check value. The c’s shown above in red are the constants from the encoding matrix; they are all constants, but not all the same constant.
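As a minimal sketch of this sums-of-products computation, the Python below uses ordinary integer arithmetic purely for illustration; a real RS encoder performs the same multiply-and-sum in a finite field GF(2^m). The function name `encode_checks` and the matrix values are assumptions made up for this example. The first row reuses the constants from the check equation that appears later on this page.

```python
def encode_checks(encoding_matrix, message):
    """Return one check value per matrix row: the sum of constant * message products."""
    return [sum(k * m for k, m in zip(row, message)) for row in encoding_matrix]

# Hypothetical 4 x 5 encoding matrix (constants) and 5 message values.
ENCODING_MATRIX = [
    [7, 8, 6, 5, 4],
    [1, 2, 3, 4, 5],
    [3, 1, 4, 1, 5],
    [2, 7, 1, 8, 2],
]
message = [2, 4, 5, 5, 2]

checks = encode_checks(ENCODING_MATRIX, message)
print(checks)  # [109, 55, 45, 81]
```

Each row of the matrix yields one check value, so four rows give four check values for the same five message values.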

Many companies loosely couple a number of records together in a group so that RS encoding is done “across” those records, while the records themselves can still be accessed independently. (This type of RAID scheme is explained in more depth here.) Encoded message values that span a number of records form “horizontal” or “lateral” RS codewords. Message and check values are small m-bit quantities, often bytes, as illustrated below.

With that scheme, records from individual devices can be read independently. Many reading operations can occur simultaneously when everything is operating without error.

Writing with that scheme is inefficient because all of the redundant check values, which are stored on different devices, must be updated to reflect the new data values. (Again, this is explained in more depth here.)

In addition, if data is being reconstructed after a previous device failure, writing to that set of records must stop until reconstruction is finished.

Consider the following single check equation where the red numbers are constants from an encoding matrix, the black numbers are message values from several independent records and the blue number is one redundant check value.

109 = 2 * 7 + 4 * 8 + 5 * 6 + 5 * 5 + 2 * 4 

Suppose we want to change the first message value from 2 to 9.

Subtract the old check equation from the new check equation as shown below...
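Worked through with the numbers above, and taking 7 as the constant paired with the first message value, the subtraction looks like this. The arithmetic is ordinary integer arithmetic, as in the example; real RS codes do the same thing in a finite field, and the symbols c_old and c_new are labels introduced here for the old and new check values.

```latex
\begin{aligned}
c_{\text{new}} &= 9 \cdot 7 + 4 \cdot 8 + 5 \cdot 6 + 5 \cdot 5 + 2 \cdot 4 = 158 \\
c_{\text{old}} &= 2 \cdot 7 + 4 \cdot 8 + 5 \cdot 6 + 5 \cdot 5 + 2 \cdot 4 = 109 \\
c_{\text{new}} - c_{\text{old}} &= (9 - 2) \cdot 7 = 49 \\
c_{\text{new}} &= c_{\text{old}} + (9 - 2) \cdot 7 = 109 + 49 = 158
\end{aligned}
```

All terms that did not change cancel, so the new check value depends only on the old check value, the old and new message values, and one constant.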

To update this one check value for a new message value, the old message value and old check value must first be read, the new check value computed and the new message value and check value written.
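The read, compute, write sequence can be sketched as follows. This is an illustration only: the two dictionaries are toy stand-ins for a data device and a check device, the names `update_check`, `m0` and `c0` are hypothetical, and integer arithmetic is used in place of finite-field arithmetic.

```python
def update_check(old_check, old_message, new_message, constant):
    """New check = old check + (new message - old message) * constant."""
    return old_check + (new_message - old_message) * constant

# Toy stand-ins for two devices: one holds a message value, one holds a check value.
data_device = {"m0": 2}
check_device = {"c0": 109}

# 1. Read the old message value and the old check value.
old_message = data_device["m0"]
old_check = check_device["c0"]

# 2. Compute the new check value (7 is the constant paired with m0).
new_message = 9
new_check = update_check(old_check, old_message, new_message, constant=7)

# 3. Write the new message value and the new check value.
data_device["m0"] = new_message
check_device["c0"] = new_check

print(new_check)  # 158
```

Between step 3's two writes, the stored message value and the stored check value briefly disagree with the check equation.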

There are usually thousands of bytes in each record, so the above computation must be done thousands of times for each record, and that whole process must be repeated R times, where R is the number of redundant bytes in each horizontal codeword. This method becomes impractical when R > 2 and is highly inefficient even when R = 2.
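To give a rough feel for how the work scales, the sketch below applies the single-value update to every byte of a record and to each of R check records, counting the multiply-and-add steps. The record size, the value of R, the constants and the function name `update_check_records` are all assumptions chosen for illustration, and integer arithmetic again stands in for finite-field arithmetic.

```python
def update_check_records(old_record, new_record, check_records, constants):
    """Update R check records in place for one rewritten data record.

    constants[r] is the encoding constant that ties this data record to
    check record r. Returns the number of multiply-and-add steps performed.
    """
    steps = 0
    for r, check_record in enumerate(check_records):
        for i, (old, new) in enumerate(zip(old_record, new_record)):
            check_record[i] += (new - old) * constants[r]
            steps += 1
    return steps

RECORD_BYTES = 4096  # assumed record size
R = 4                # assumed number of check values per horizontal codeword

old_record = [0] * RECORD_BYTES
new_record = [1] * RECORD_BYTES
check_records = [[0] * RECORD_BYTES for _ in range(R)]
constants = [7, 3, 5, 2]  # hypothetical constants, one per check record

steps = update_check_records(old_record, new_record, check_records, constants)
print(steps)  # 16384 = 4096 bytes * 4 check records
```

On top of the arithmetic, following the read-and-write pattern described above, one such record write would involve reading the old data record and R old check records, then writing the new data record and R new check records.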

It is important to understand that the above-mentioned mathematical relationship between the message and check values must always be maintained in order to maintain fault tolerance.

When writing new data values, some companies recompute the check values over a potentially very long period of time: microseconds, milliseconds, seconds, minutes, hours or even days. How long it takes depends on how busy the operating system is, the current state of the storage system, and the priority that check-value recomputation has relative to other tasks in a multitasking operating system. During that time, fault-tolerance is temporarily lost, and if anything goes wrong before recomputation finishes, fault-tolerance is permanently lost.

ECC Tek’s Proposal

ECC Tek is proposing an RS RAID system in which the relationship between the message and check values is continuously maintained, because all of the check values are always computed simultaneously in hardware, in one tick of a very high frequency clock, as illustrated below.