
Important High-Level ECC Design Considerations

Reliability and Redundancy

This article discusses reliability and the use of redundancy to increase reliability.

Without redundancy, many memory technologies, such as MLC NAND Flash, would be too unreliable to be practical. With redundancy, they can be made arbitrarily reliable: generally speaking, the more redundancy that is added to the data, the more reliable the memory becomes.

Consider MLC NAND Flash as an example. The more bits stored per cell, the less reliable MLC NAND Flash memories become. However, it is usually possible to add perhaps 10% redundancy and thereby enable the capacity of an MLC NAND Flash memory to be doubled. Adding another 10% redundancy may enable the capacity to be doubled again, and so on.
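The arithmetic behind that trade can be sketched in a few lines of Python. This is only an illustration of the rough figures quoted above: the function name is hypothetical, and it assumes the redundancy is counted as a fraction of total (raw) capacity and accumulates linearly with each doubling, neither of which is stated in the text.

```python
# Illustrative arithmetic only. The "10% per doubling" figure is the article's
# rough example; treating redundancy as a fraction of total capacity that
# grows linearly with each doubling is an assumption made for this sketch.

def usable_capacity(base_capacity, doublings, redundancy_per_doubling=0.10):
    """Return usable capacity after doubling the raw capacity `doublings` times."""
    total = base_capacity * 2 ** doublings
    redundancy_fraction = redundancy_per_doubling * doublings
    if redundancy_fraction >= 1.0:
        raise ValueError("redundancy would consume the entire capacity")
    return total * (1.0 - redundancy_fraction)


if __name__ == "__main__":
    for k in range(3):
        print(f"doublings={k}  usable={usable_capacity(1.0, k):.2f}")
```

Under these assumptions, each doubling costs a tenth of the raw capacity but still leaves usable capacity well above the previous step, which is the point the example is making.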

2D-RS Redundancy

Some may argue that the 2D ECC Schemes proposed by ECC Tek take too much redundancy to be practical.

However, it is important to understand that the basic memory element used in a 2D-RS SSD or 2D-RS HDD (either a solid-state memory chip or an HDA) must be much more reliable when it is used as a “stand-alone device” than when it is used as one element in an array of memory elements. A stand-alone HDD product, for example, would be too unreliable to be practical if, on average, it failed to recover data a few times a month. However, if the HDA from that HDD were used as a memory element in a 2D-RS HDD, it would probably be reliable enough, because the horizontal ECC could easily correct those infrequently occurring failures.
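To see why an element that is too unreliable on its own can be acceptable inside an array, a rough probability estimate helps. The sketch below is not a model of the 2D-RS scheme itself: it assumes a hypothetical horizontal code that can recover from up to `max_correctable` failed elements out of `n_elements`, and it assumes failures are independent with the same probability `p_fail` per element. All three parameters and the function name are illustrative.

```python
from math import comb

# Rough estimate only. Assumes independent element failures and a horizontal
# code that tolerates up to `max_correctable` failed elements; both are
# assumptions for illustration, not properties taken from the article.

def prob_unrecoverable(n_elements, max_correctable, p_fail):
    """Probability that more than `max_correctable` of `n_elements` fail together."""
    # Sum the binomial tail directly to avoid the precision loss of 1 - P(ok).
    return sum(
        comb(n_elements, k) * p_fail ** k * (1.0 - p_fail) ** (n_elements - k)
        for k in range(max_correctable + 1, n_elements + 1)
    )


if __name__ == "__main__":
    # Hypothetical numbers: 10 elements, each failing 1 read in 1000.
    print("no horizontal ECC:   ", prob_unrecoverable(10, 0, 1e-3))
    print("tolerates 2 failures:", prob_unrecoverable(10, 2, 1e-3))
```

With these made-up numbers, tolerating two failed elements lowers the chance of an unrecoverable read by several orders of magnitude, which is the effect the paragraph above relies on.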

Most likely nobody knows precisely how much the capacity of a storage system could be increased by the use of the 2D-RS scheme, but adding, say, 25-30% redundancy may allow the usable memory capacity to be significantly increased over what it would be if each memory element were used as a stand-alone product. If that proved to be true, then the extra redundancy added by the 2D-RS ECC scheme would essentially cost nothing and could possibly even decrease the $/TB cost of the memory system.
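The break-even point for that argument is easy to compute. The following sketch assumes that the cost of each memory element is unchanged, that redundancy is expressed as a fraction of total capacity, and that relaxing the per-element reliability requirement raises raw capacity by some factor `raw_gain`. That factor is hypothetical; as noted above, nobody knows its real value.

```python
# Break-even sketch. `raw_gain` (how much raw capacity grows once elements no
# longer need stand-alone reliability) is a hypothetical input, and equal cost
# per memory element is an assumption made for this illustration.

def break_even_gain(redundancy_fraction):
    """Raw-capacity gain at which usable capacity equals the stand-alone case."""
    if not 0.0 <= redundancy_fraction < 1.0:
        raise ValueError("redundancy_fraction must be in [0, 1)")
    return 1.0 / (1.0 - redundancy_fraction)


def relative_cost_per_tb(raw_gain, redundancy_fraction):
    """Cost per usable TB relative to the stand-alone element (1.0 = same cost)."""
    return 1.0 / (raw_gain * (1.0 - redundancy_fraction))


if __name__ == "__main__":
    for r in (0.25, 0.30):
        print(f"redundancy={r:.2f}  break-even raw gain={break_even_gain(r):.3f}")
```

Any raw-capacity gain above the break-even value gives a relative cost below 1.0, that is, a lower $/TB than the stand-alone element under these assumptions.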

Capacity and Redundancy

The following graph is intended to convey a concept and is not intended to be precise. Consider a memory that can operate at three operating points A, B and C.

If the memory is operating at point A, there is normally some way to increase its total capacity so that it can operate at point B. Total capacity can be increased by decreasing the size of the memory cells or by storing multiple bits per cell. Increasing the total capacity normally increases the error rate (decreases the reliability) of the memory, so more redundancy is needed to meet the error rate (reliability) requirement. The usable capacity is the total capacity minus the redundancy.
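The relationship in the paragraph above can be written compactly. The symbols are introduced here for illustration and do not appear in the original: T is total capacity, R is the capacity devoted to redundancy, r = R/T is the redundancy fraction, and U is usable capacity.

```latex
\[
U = T - R = T\,(1 - r), \qquad r = \frac{R}{T}
\]
\[
U_B > U_A \iff T_B\,(1 - r_B) > T_A\,(1 - r_A)
\]
```

In words, moving from operating point A to operating point B is worthwhile whenever the growth in total capacity outweighs the additional fraction given up to redundancy.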

We could say that increasing the usable capacity of a memory requires more redundancy, but we could also say that if we increase the redundancy, there probably is some way to increase the usable capacity as long as we are not at the point where the total capacity has reached a limit.