Business data processing needs have grown substantially every year, from terabytes to petabytes and often beyond the reach of the traditional server model. This growth has pushed big data analysis into the realm of scalable distributed computing, serving industries including healthcare, bioscience, financial services, and oil and gas.
A 2003 University of California, Berkeley report stated that, “In 2002 telephone calls worldwide on both landlines and mobile phones contained 17.3 exabytes of new information if stored in digital form,” and that “It would take 9.25 exabytes of storage to hold all U.S. [telephone] calls each year.”1 (An exabyte is equal to one quintillion bytes, one billion gigabytes, or 1,000 petabytes.) A later estimate by the University of Southern California put the amount of data stored worldwide by 2007 at 295 exabytes, raising great manageability concerns.2
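The unit relationships in the parenthetical above are easy to get wrong by a factor of a thousand, so a short sketch may help. It assumes decimal (SI) units, where each step up is a factor of 1,000; the constant and function names are illustrative only and do not come from the cited reports.

```python
# Decimal (SI) storage units expressed in bytes.
# Assumption: the cited reports use decimal units (1 GB = 10**9 bytes).
GIGABYTE = 10**9
PETABYTE = 10**15
EXABYTE = 10**18


def convert(value, from_unit, to_unit):
    """Convert a quantity from one unit to another (both given in bytes)."""
    return value * from_unit / to_unit


# One exabyte expressed in smaller units.
print(convert(1, EXABYTE, PETABYTE))   # petabytes per exabyte
print(convert(1, EXABYTE, GIGABYTE))   # gigabytes per exabyte

# The figures quoted in the text, expressed in petabytes.
for label, exabytes in [
    ("2002 worldwide telephone calls", 17.3),
    ("U.S. telephone calls per year", 9.25),
    ("data stored by 2007", 295),
]:
    print(f"{label}: {convert(exabytes, EXABYTE, PETABYTE):,.0f} PB")
```

Expressing everything in bytes first keeps the conversion to a single multiplication and division, whichever pair of units is involved.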
When a company’s data processing needs grow beyond what its current practices and traditional storage solutions can feasibly handle for these gargantuan datasets, a distributed computing solution can prove invaluable. A high performance, standardised, easily deployable and reliable data analysis tool such as Apache Hadoop can provide accurate and timely results in big data scenarios.
Unfortunately, even an innovative solution such as Apache Hadoop cannot achieve its true potential using only traditional rotating-platter mechanical hard disk drives. These drives are simply not capable of offering the sustained high bandwidth and low latency required by today’s highly evolved, high-capacity big data infrastructure.
Fast, resilient and cost-effective NAND Flash-based storage solutions present an opportunity to reach new levels of performance and reliability, reduce total cost of ownership (TCO), and eliminate storage sprawl in big data infrastructure. They can be deployed in standalone configurations or combined with traditional hard disk drives in archive and transactional tier storage models.
Achievements in reaching 1.8 million input/output operations per second (IOPS) utilising commercially available off-the-shelf (COTS) components have shown that allowing big data analysis tools to reach their true potential does not have to be costly or complex. For example, Kingston Technology’s server-grade solid-state drives (SSDs) offer a variety of benefits to companies in the bioscience and healthcare industries that currently utilise hard disk drives, including:3
- Accelerating the management of, and access to, client records in enterprise-class databases country-wide to service customer needs faster. As populations grow every year, so do their healthcare needs and, as a direct result, so do server performance requirements. A 2010 Accenture study estimated that North America “will experience 9.7 percent growth in its electronic medical record (EMR) market,” from $7.4 billion in 2010 to $9.8 billion in 2013, adding that “With 5,800 hospitals, EMR adoption is beginning to accelerate due to ARRA (American Recovery and Reinvestment Act of 2009) incentives and penalties.”4
- Reducing the time needed to analyze relationships in electronic health records (EHRs) and EMRs in order to predict and analyze the spread of diseases between population groups, and turning raw data into valuable information faster using systems such as IBM’s Watson.5
- Increasing enterprise resource planning (ERP) performance for asset tracking and resource management, thereby improving operational performance and allowing hospitals and mobile medical services to service patient needs more efficiently.
- Reducing time to market and overall costs for new medicines by accurately modeling and simulating new biomedical devices or treatments for illnesses. The research effort, development costs and time to market of new medicines increase in proportion to the complexity of an illness, so such reductions can be vital in saving lives and in encouraging pharmaceutical companies to invest resources into development.6
While there is no doubt that 1.8 million IOPS can benefit these industries tremendously, an SSD’s low access latency and high-volume processing performance also lend themselves naturally to online transaction processing (OLTP) operations, where the main objectives are to increase the number of transactions that can be processed per second and to reduce the response time of queries against the transactional database.
OLTP applications, ranging from a travel agent’s e-booking system to a bank’s automated teller machine (ATM), are well suited to the benefits of SSD architecture, whether incorporated in a tiered storage solution or in a complete SSD storage solution. Such a solution supports the typical needs of a live transactional database that controls fundamental business tasks: operating with little delay and minimising the risk of transactional 'blackouts.'
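The two OLTP objectives named above, more transactions per second and a shorter response time per query, can be related to raw storage figures with some back-of-the-envelope arithmetic. The sketch below is only a rough model under stated assumptions: storage is treated as the sole bottleneck, each transaction issues a fixed number of I/O operations one after another, and the values of 20 I/Os per transaction and 0.1 ms per I/O are hypothetical placeholders rather than measurements. Only the 1.8 million IOPS figure comes from the text.

```python
def max_transactions_per_second(iops, ios_per_transaction):
    """Upper bound on transactions/second if storage is the only bottleneck."""
    if ios_per_transaction <= 0:
        raise ValueError("ios_per_transaction must be positive")
    return iops / ios_per_transaction


def storage_time_per_transaction(ios_per_transaction, io_latency_s):
    """Storage time for one transaction whose I/Os are issued sequentially."""
    return ios_per_transaction * io_latency_s


# Figure quoted in the text.
QUOTED_IOPS = 1_800_000

# Hypothetical workload parameters, chosen only for illustration.
IOS_PER_TRANSACTION = 20
IO_LATENCY_S = 0.0001  # 0.1 ms per I/O, an assumed value

tps = max_transactions_per_second(QUOTED_IOPS, IOS_PER_TRANSACTION)
delay = storage_time_per_transaction(IOS_PER_TRANSACTION, IO_LATENCY_S)

print(f"Upper bound: {tps:,.0f} transactions per second")
print(f"Storage time per transaction: {delay * 1000:.1f} ms")
```

In this simple model, raising IOPS lifts the throughput ceiling, while lowering per-I/O latency shortens the storage portion of each query's response time. Real workloads also depend on CPU, network, locking and caching, which the sketch deliberately ignores.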