NVMe: Redundancy and RAID

We’ve discussed how NVMe opens up several technical opportunities, as well as challenges, for today’s data centers. In theory, implementing NVMe frees the storage device from the hardware controller and delivers performance far beyond what is possible with SATA and SAS.

Aside from performance considerations, one of the biggest concerns for data center managers is redundancy. While NVMe storage can be attached to traditional hardware controllers, a more efficient approach to redundancy is a software-defined storage (SDS) platform.

When an organisation switches to NVMe, it has to work out how it will continue to meet its high-availability requirements. This is particularly true for an organisation with very demanding SLAs.

Hardware-based RAID controller manufacturers will need to adapt to the emergence of NVMe and offer solutions that connect to existing U.2 server backplanes to support hardware-based NVMe RAID. There are already a few RAID controller cards on the market that support NVMe, but the market is still new. With hardware-based RAID for NVMe in the fairly early stages of development, organisations making the switch will have to weigh architectural design decisions about how they will continue to meet their high-availability requirements: through software-based HCI and storage solutions such as vSAN and Ceph, Linux software RAID or LVM mirroring, or application-level high-availability replication such as SQL Server Always On or Oracle ASM mirroring. One can argue that these software-based design decisions should still be made even with hardware-based RAID controllers, since a hardware RAID controller protects against only a single point of failure.

The switch to NVMe requires a comprehensive, full-stack review by IT architects and application owners to confirm that redundancy exists at every layer of the stack, from compute to network to storage, so that the required SLAs are met. When applications share storage resources, it is critical to implement strong redundancy practices at the storage layer. The right solution will depend on the existing underlying architecture.
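To make the full-stack point concrete, here is a minimal sketch of the arithmetic behind it. It assumes that component failures are independent and that layers sit in series (every layer must be up for the service to be up); real failures are often correlated, so treat the result as an optimistic estimate. The function names and the availability figures are hypothetical, chosen only for illustration.

```python
from math import prod


def redundant(availability: float, copies: int) -> float:
    """Availability of `copies` independent replicas when one survivor is enough."""
    return 1.0 - (1.0 - availability) ** copies


def stack(layers: list[float]) -> float:
    """Availability of layers in series: the service needs every layer up."""
    return prod(layers)


# Hypothetical per-layer availabilities, for illustration only.
compute, network, drive = 0.999, 0.999, 0.99

# A single unprotected NVMe drive versus a two-way mirror at the storage layer.
single = stack([compute, network, drive])
mirrored = stack([compute, network, redundant(drive, 2)])

print(f"single drive:   {single:.6f}")
print(f"mirrored drive: {mirrored:.6f}")
```

Under these assumptions, mirroring lifts the storage layer, but the stack as a whole is still capped by the least available layer, which is why the review has to cover compute and network as well.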

For example, if the servers are virtualised, moving to a software-based HCI solution like vSAN might make sense. vSAN offers granular redundancy at the VM level, and can protect VMs against up to two complete storage node failures. If the company relies on all-flash arrays as a centralised storage solution, most NVMe all-flash arrays already come with a software-based RAID implementation, but to protect further against the failure of a complete array, high-availability storage replication might be the key.
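As a rough planning aid, the cost of tolerating node failures with mirroring can be sketched as follows. This assumes the commonly cited rule for mirrored storage policies: tolerating n failures takes n + 1 full copies of the data and at least 2n + 1 nodes, the extra nodes being needed to keep a quorum. The function name and return format are hypothetical and do not correspond to any product's API; erasure-coded policies follow different rules and are not covered.

```python
def mirroring_footprint(failures_to_tolerate: int, vm_size_gb: float) -> dict:
    """Estimate copies, minimum node count and raw capacity for a mirrored VM.

    Assumption: n tolerated failures -> n + 1 data copies and 2n + 1 nodes.
    """
    if failures_to_tolerate < 0:
        raise ValueError("failures_to_tolerate must be zero or greater")
    copies = failures_to_tolerate + 1
    return {
        "copies": copies,
        "min_nodes": 2 * failures_to_tolerate + 1,
        "raw_capacity_gb": vm_size_gb * copies,
    }


# Example: a hypothetical 100 GB VM that must survive two node failures.
print(mirroring_footprint(2, 100.0))
```

The sketch shows why protection against two node failures is a capacity decision as much as an availability one: under this rule, the raw capacity consumed is three times the size of the VM.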

Kingston regularly asks customers, "How do you manage your storage?"

Most customers are beginning to test the complexities of a switch from hardware-controlled to software-defined architectures. Some have purchased solutions to test, while a select few have had to write their own Linux-based software-defined storage package to manage their data center.

For many companies, extending the software-defined approach to the RAID controller is not simply a big deal; it is a fundamental change. There is much for a company to digest, and there are numerous decisions to be made. Because of this, companies specialising in software-defined storage are now emerging.

