Blog


I will post my latest ideas here.

LDPC and Binary BCH Combined

Current HDDs contain an inner high-performance Low-Density Parity-Check (LDPC) encoder and decoder and an outer Reed-Solomon (RS) encoder and decoder to reduce channel noise, as illustrated below...

[Illustration: HDD data path with an inner LDPC code and an outer RS code]


HDD manufacturers are not overly concerned about the size of the LDPC encoder and decoder because they are implemented in a chip which contains millions of gates. The other digital functions needed for a disk drive do not consume much chip area, so millions of gates are available for the LDPC encoder and decoder.

Since it makes sense to have an LDPC encoder and decoder which inputs soft decision information close to the media, one might ask, "Why not do the same thing with MLC NAND Flash chips as is done with HDDs? That is, why not put an LDPC encoder and decoder close to the MLC NAND Flash memory array?" There are some issues involved with doing that, as discussed below.

Here's an illustration of a Flash chip...

[Illustration: Flash chip layout]


If a powerful LDPC encoder and decoder such as the ones used with HDDs, which might reduce the error rate from something like 10⁻² to something like 10⁻¹⁵, were implemented on the Flash chip, the area consumed by the LDPC encoder and decoder would be a considerable fraction of the total area of the chip and would reduce the capacity of the Flash memory array as illustrated below...

[Illustration: Flash chip with a large on-chip LDPC encoder and decoder displacing memory array area]



Doing in Flash what HDDs do would probably be impractical because too much of the Flash capacity would have to be sacrificed to implement the LDPC encoder and decoder.

It makes sense to use an LDPC decoder with channels that provide soft decision information - that is, channels that output a probability value for each bit, usually represented as a Log-Likelihood Ratio (LLR) in the form 0xxx or 1xxx, where the leftmost bit indicates the most-likely value of the bit that was written in that bit position and the xxx value indicates the confidence level. For example, 1111 would indicate that it was determined that a 1 was written in that bit position and there is a very high level of confidence in that determination. A 1000 would indicate that it was determined that a 1 was written in that bit position but there is a very low level of confidence in that determination. Similarly, 0100 would indicate that it was determined that a 0 was written in that bit position with a medium level of confidence.
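As a purely illustrative example of how such a value might be produced, here is a small Python sketch that quantizes a probability into the 4-bit sign-plus-confidence form described above. The function name and the scaling are my own choices for illustration; a real design would match the scaling to the channel, and actual LLR formats vary by chip.

    import math

    def quantize_llr(p1, magnitude_bits=3):
        # Map P(bit == 1) to the 4-bit form described above:
        # top bit = hard decision, low 3 bits = confidence level
        # (0 = very unsure, 7 = very sure).
        llr = math.log(p1 / (1.0 - p1))   # positive favors 1, negative favors 0
        hard = 1 if llr >= 0 else 0
        # Saturate |LLR| into the available confidence range; the scale
        # (one confidence step per LLR unit) is an arbitrary choice here.
        conf = min(int(abs(llr)), (1 << magnitude_bits) - 1)
        return (hard << magnitude_bits) | conf

    print(format(quantize_llr(0.999), '04b'))  # 1110: a 1 with high confidence
    print(format(quantize_llr(0.52), '04b'))   # 1000: a 1 with very low confidence
    print(format(quantize_llr(0.20), '04b'))   # 0001: a 0 with low confidence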

Since development of powerful LDPC encoders and decoders is extremely complicated and often requires a team of 50 or more people (many with PhDs), information about particular LDPC implementations is currently kept confidential, but most likely powerful LDPC implementations require millions of gates.

LDPC "codes", in my opinion, would be more accurately described as "signal processing methodologies" which are used to reduce the effect of noise rather than as "error-correcting codes" since they normally operate on sequences of probabilities rather than on sequences of bits.

It seems to me that the best way to implement LDPC encoders and decoders with Flash so that they are "close to the media" would be to constrain the chip area that can be used for them, as illustrated below...

[Illustration: Flash chip with an area-constrained on-chip LDPC encoder and decoder]



and then use a binary BCH encoder and decoder outside of the Flash chip, in combination with the on-chip LDPC encoder and decoder, to further reduce the error rate as illustrated below...

[Illustration: area-constrained on-chip LDPC encoder and decoder combined with an off-chip binary BCH encoder and decoder]



The above arrangement would allow a low-area LDPC decoder to input soft decision information and also allow a binary BCH code, which can be implemented very efficiently, to be used. I believe this system would require significantly fewer gates and consume significantly less power than an equivalent large-area LDPC encoder and decoder. In addition, the performance of the binary BCH system can be precisely determined, which cannot be done with a single large-area LDPC system. The low-area LDPC decoder might reduce the error rate from something like 10⁻² to something like 10⁻⁵, and the binary BCH decoder would reduce the error rate from something like 10⁻⁵ to something like 10⁻¹⁵ or less.
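A note on why the binary BCH performance can be precisely determined: a bounded-distance BCH decoder is guaranteed to correct any pattern of up to t bit errors in an n-bit block, so with independent bit errors the probability that a block exceeds the decoder's guarantee is an exact binomial tail. Here is a minimal Python sketch; the n = 4096, t = 8 parameters are hypothetical, chosen only to illustrate the kind of numbers involved.

    from math import lgamma, log, exp

    def bch_block_failure_prob(n, t, p):
        # P(more than t of n independent bits are in error): the exact
        # binomial tail, computed in log space so large n doesn't
        # overflow or underflow.
        total = 0.0
        for i in range(t + 1, n + 1):
            log_term = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                        + i * log(p) + (n - i) * log(1 - p))
            term = exp(log_term)
            total += term
            if term < total * 1e-18:   # remaining tail is negligible
                break
        return total

    # A hypothetical t = 8 binary BCH code on 4096-bit blocks, fed by an
    # inner decoder with an output bit error rate of 1e-5:
    print(bch_block_failure_prob(4096, 8, 1e-5))   # roughly 9e-19

With these illustrative parameters the block failure rate comes out comfortably below the 10⁻¹⁵ figure mentioned above.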

If Flash manufacturers require that their customers implement LDPC encoders and decoders, then those manufacturers will need to design Flash chips so that they output a set of probabilities or log-likelihood ratios instead of a set of bits. That will mean a substantial change to the Flash chip interface, because m bits will need to be output for each past-generation Flash "bit".


LDPC Decoding Process Illustrated

To gain some insight into the LDPC decoding process, one can compare it to the legal process of trying thousands of people, each of whom has been accused of committing a crime and each of whom is either innocent (0) or guilty (1).

Various witnesses would testify to a judge or jury in support of either an innocent or guilty verdict for each person. The trials would be short if evidence were limited. The time allowed for deliberations to determine innocence or guilt could also be constrained.

Even if the evidence is limited and the time allowed for deliberations is short, the determinations regarding guilt and innocence will probably still be fairly good.
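To make the analogy a little more concrete, here is a toy Python sketch of Gallager-style hard-decision bit flipping, one of the simplest LDPC decoding algorithms. Each unsatisfied parity check acts like a witness accusing the bits it involves, and the iteration limit plays the role of the constrained deliberation time. The parity-check matrix below is made up for illustration (real LDPC matrices are large and sparse), and production decoders pass soft LLR messages rather than hard decisions.

    import numpy as np

    # Toy parity-check matrix: far too small and dense to be a real
    # LDPC code, but enough to show the structure.
    H = np.array([[1, 1, 0, 1, 0, 0],
                  [0, 1, 1, 0, 1, 0],
                  [1, 0, 0, 0, 1, 1],
                  [0, 0, 1, 1, 0, 1]])

    def bit_flip_decode(r, H, max_iters=10):
        r = r.copy()
        for _ in range(max_iters):
            syndrome = H.dot(r) % 2       # which checks are unsatisfied
            if not syndrome.any():
                return r, True            # all checks satisfied: verdict reached
            votes = syndrome.dot(H)       # accusations against each bit
            r[np.argmax(votes)] ^= 1      # flip the most-accused bit
        return r, False                   # hung jury: deliberation time expired

    received = np.array([0, 0, 1, 0, 0, 0])   # all-zero codeword with 1 bit error
    print(bit_flip_decode(received, H))        # ([0 0 0 0 0 0], True)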


RAID-5-like RAID Systems Do Not Always Use Horizontal Redundancy

In RAID systems which use RAID-5-like schemes, the horizontal/lateral redundancy is not used for decoding most of the time. The only time the horizontal redundancy is used for decoding is when a memory device has failed. It is then used to reconstruct the failed data. 

In the types of schemes promoted by ECC Tek (RAID-2-like schemes), the horizontal redundancy can be used all the time by the decoders to reduce the error rate. For example, suppose we have 3 memory devices which contain redundancy. That is, the horizontal RS code has n − k = 3 redundant symbols per codeword. For RS codes, 2t + s ≤ n − k must hold, where t is the number of errors and s is the number of erasures to be corrected. In this scenario, the horizontal redundancy can work cooperatively with the vertical/longitudinal redundancy. If the vertical decoder is unable to correct a severe error pattern, it will, with a high degree of certainty, fail, and that failure indicator can be used to declare the symbols in that position erased so they can easily be corrected by the horizontal decoder.
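Here is a minimal Python sketch of that cooperative decoding. For brevity it uses a single XOR parity device, which can repair exactly one declared erasure, as a stand-in for the horizontal RS code (an RS code with 3 redundant devices could handle any mix of errors and erasures with 2t + s ≤ 3). The device count and data are made up for illustration.

    import numpy as np

    def repair_erasure(stripes, erased):
        # Rebuild the declared-erased device by XORing all the others.
        others = [d for i, d in enumerate(stripes) if i != erased]
        return np.bitwise_xor.reduce(others)

    # Hypothetical layout: 4 data devices plus 1 XOR parity device.
    data = [np.array([1, 0, 1, 1]), np.array([0, 1, 1, 0]),
            np.array([1, 1, 0, 0]), np.array([0, 0, 0, 1])]
    stripes = data + [np.bitwise_xor.reduce(data)]

    # Suppose the vertical BCH decoder on device 2 reports an
    # uncorrectable block.  Rather than outputting wrong data, declare
    # that device's symbols erased and rebuild them horizontally.
    rebuilt = repair_erasure(stripes, erased=2)
    assert np.array_equal(rebuilt, data[2])
    print("device 2 rebuilt:", rebuilt)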

This is of interest because LDPC advocates push the idea that LDPC codes are more powerful than binary BCH codes for a fixed amount of redundancy (which may or may not be true, but assume it is true). However, if a vertical binary BCH code is used in combination with a horizontal RS code, the power of the vertical binary BCH code to correct errors will be dramatically increased, and the combination of the binary BCH codes and RS codes would most likely provide much more protection than a single, vertical LDPC code could provide.

These ideas could be combined with the memory/storage architecture ideas presented here.