Technical Brief

What is R.A.I.S.E.?

Figure 1. SF-2500 Flash Storage Processor Block Diagram [1]

Redundant Array of Independent Silicon Elements (R.A.I.S.E.™) is a complementary technology to the Error Correcting Code (ECC) capabilities of the Flash Storage Processor (FSP) found in the LSI® SandForce® DuraClass™ technology component.

NAND Flash suffers from naturally occurring bit errors (BE) during use. From the Beginning of Life (BOL) to the End of Life (EOL) of the NAND Flash, these bit errors are detected and corrected by the embedded Error Correcting Code (ECC) component.

Figure 2. An example of exponential NAND BER growth

The Bit Error Rate (BER) is characterised by the NAND Flash manufacturer during production and is largely dependent on the fabrication process and type of NAND produced.

The BER rises as the number of program and erase cycles remaining on the NAND falls; consequently, the more often the NAND Flash device is written or erased, the higher the Bit Error Rate climbs toward the NAND EOL.

As shown in Figure 2, the uncorrected Raw Bit Error Rate (RBER) grows exponentially as NAND Flash is programmed (written) or erased from the beginning to the end of its lifecycle, ultimately leaving the device unusable once it passes the P/E cycle endurance characterised by the manufacturer.

In the rare event that a bit error does occur for a piece of data, the first line of defence is the ECC component.

ECC complexity varies with the number of bits that can be recovered (e.g., 1 bit, 2 bits ... 55 bits per 512 bytes) and the code used (e.g., BCH, Reed-Solomon). The ECC component fixes Flash errors and returns valid data to the host computer.

To characterise the strength of the ECC component, the term Uncorrectable Bit Error Rate (UBER) is used to describe the rate at which a single uncorrectable bit error will occur even after ECC is applied.


Figure 3. LSI SandForce FSP versus Standard SSD Controller UBER [2]

As Figure 3 shows, a standard SSD Controller (Flash Storage Processor) typically has an Uncorrectable Bit Error Rate of 1 bit error for every 1 quadrillion (10^15) bits processed (~0.11 Petabytes). Compared with the SandForce SSD Processor (FSP), this exposes user data to an increased risk of uncorrectable bit errors and silent errors quite early in life. [2] [3]
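The conversion from an error rate to a data volume can be checked with a few lines of arithmetic. The sketch below assumes that "Petabytes" here means binary petabytes (2^50 bytes), which is what reproduces the ~0.11 figure; with decimal petabytes (10^15 bytes) the result would be 0.125. The function and constant names are illustrative only.

```python
# Assumption: one "Petabyte" is 2**50 bytes (a binary petabyte).
PETABYTE_IN_BYTES = 2 ** 50


def data_per_uncorrectable_error(uber: float) -> float:
    """Average data volume, in petabytes, processed per uncorrectable bit error."""
    bits_per_error = 1.0 / uber          # e.g. 1e-15 -> 1 error per 1e15 bits
    bytes_per_error = bits_per_error / 8
    return bytes_per_error / PETABYTE_IN_BYTES


standard_controller = data_per_uncorrectable_error(1e-15)
print(f"{standard_controller:.2f} Petabytes per uncorrectable bit error")
```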

Once the BER exhausts the ECC capabilities on the Flash Storage Processor, especially at the NAND Flash end of life, the probability of an uncorrectable error occurring increases and data corruption may be imminent.
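To see why a rising BER eventually exhausts the ECC, the sketch below estimates the probability that a single codeword contains more bit errors than the ECC can correct. It is a simplified model, not the FSP's actual behaviour: it assumes bit errors are independent, ignores the parity bits, and uses the upper example quoted earlier (55 correctable bits per 512 bytes, i.e. 4096 data bits). The function name and the sample RBER values are hypothetical.

```python
import math


def codeword_failure_probability(rber: float, n_bits: int = 4096, t: int = 55) -> float:
    """P(more than t of n_bits are in error), assuming independent bit errors."""
    if rber <= 0.0:
        return 0.0
    if rber >= 1.0:
        return 1.0 if t < n_bits else 0.0
    log_p = math.log(rber)
    log_q = math.log1p(-rber)
    total = 0.0
    # Sum the binomial tail in log space to avoid overflow in the coefficients.
    for k in range(t + 1, n_bits + 1):
        log_term = (
            math.lgamma(n_bits + 1) - math.lgamma(k + 1) - math.lgamma(n_bits - k + 1)
            + k * log_p + (n_bits - k) * log_q
        )
        total += math.exp(log_term)
    return min(total, 1.0)


# Hypothetical RBER values, chosen only to show the trend.
for rber in (1e-4, 1e-3, 1e-2):
    p_fail = codeword_failure_probability(rber)
    print(f"RBER {rber:.0e}: P(uncorrectable codeword) = {p_fail:.3e}")
```

Under these assumptions the failure probability stays negligible while the RBER is low and climbs steeply once the expected number of errors per codeword approaches the correction limit.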

In this circumstance, the second line of defence is a small amount of NAND Flash reserved from the SSD drive capacity for implementing Redundant Array of Independent Silicon Elements (R.A.I.S.E.) protection.

Figure 4. A single bad page is rebuilt to a new known good block from redundant information [2] [4]

R.A.I.S.E. uses redundant information stored in multiple pages across the SSD's NAND Flash devices to rebuild page or block level data transparently to a known good NAND Flash block, as illustrated in Figure 4.
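The general idea of rebuilding a page from redundant information can be illustrated with plain XOR parity, the scheme RAID 5 uses. This is a generic sketch and not the R.A.I.S.E. implementation, whose internal layout is not described here; the page contents and the helper names `xor_pages` and `rebuild_page` are hypothetical.

```python
from functools import reduce


def xor_pages(pages):
    """XOR equal-length pages together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*pages))


def rebuild_page(stripe, parity, bad_index):
    """Recover the page at bad_index from the surviving pages and the parity page."""
    survivors = [page for i, page in enumerate(stripe) if i != bad_index]
    return xor_pages(survivors + [parity])


# Three data pages on (hypothetically) different NAND devices, plus one parity page.
stripe = [b"page-A..", b"page-B..", b"page-C.."]
parity = xor_pages(stripe)

# Pretend page 1 returned an uncorrectable ECC error and rebuild it.
recovered = rebuild_page(stripe, parity, bad_index=1)
print(recovered == stripe[1])
```

XOR parity can recover only one lost page per stripe, which is why such a scheme is positioned behind ECC rather than in place of it.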

This technology provides the protection and reliability of RAID 5 (Redundant Array of Independent Disks) on a single SSD drive without twice the write overhead of parity. It also delivers an Uncorrectable Bit Error Rate (UBER) of 10^-29, or 1 bit error for every 100 octillion (10^29) bits processed (~1.11 × 10^13 Petabytes of data), which is roughly 100 trillion times lower than that of a standard SSD Flash Storage Processor without R.A.I.S.E.™

Page and block level recovery (single bit per stripe) can take place in 50–100 ms and has no user-perceivable impact, allowing a seamless error recovery process with guaranteed data integrity.

With each new lithography shrink, the complexity of managing smaller NAND Flash geometries increases and Program/Erase endurance decreases. Accordingly, R.A.I.S.E. protection has become the solution recommended by NAND Flash manufacturers to manage and enhance NAND Flash reliability.

Figure 5. NAND data protection layers using ECC, R.A.I.S.E. and CRC-32

In circumstances where silent errors may occur due to non-detection of an uncorrectable bit error by the ECC component, invalid data may be returned to the host computer and risk compromising user data integrity.

Since no error was detected by the FSP ECC component, R.A.I.S.E. is unable to assist. Instead, the End-to-End 32-bit CRC check is used to catch corrupted in-flight data before invalid data is returned to the host as valid.
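A minimal sketch of such an end-to-end check follows. It uses the CRC-32 in Python's standard `zlib` module purely for illustration; which CRC-32 polynomial the FSP uses, and where in the data path it is computed, are not stated here. The function names `protect` and `verify` and the sample payload are hypothetical.

```python
import zlib


def protect(payload: bytes):
    """Attach a CRC-32 to the payload when it enters the data path."""
    return payload, zlib.crc32(payload)


def verify(payload: bytes, expected_crc: int) -> bool:
    """Recompute the CRC-32 on the way out and compare it with the stored value."""
    return zlib.crc32(payload) == expected_crc


data, crc = protect(b"user data block")

# Simulate a silent single-bit error that slipped past ECC.
corrupted = bytearray(data)
corrupted[0] ^= 0x01

print(verify(data, crc))              # True: intact data passes
print(verify(bytes(corrupted), crc))  # False: the flipped bit is caught
```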

In mission-critical applications like stock trading, returning even a single bit of corrupted data to the host computer as valid could have severe financial consequences if the error isn’t caught immediately.

Conclusion

NAND Flash management complexity increases exponentially from beginning to end of life.

Managing the increasing Bit Error Rate (BER) requires innovative solutions such as LSI SandForce R.A.I.S.E. to guarantee data protection beyond ECC throughout the finite program and erase endurance of NAND Flash devices.

Using anything less than R.A.I.S.E. to complement an already complex error correcting system (ECC) and LSI SandForce DuraClass Flash management technology would risk the integrity of not only the user data but also the entire SSD in client, enterprise and industrial application classes during the SSD life cycle.

References