All businesses have a core set of applications and services that are critical to their ongoing operation and growth. They are the lifeblood of a business. Many of these applications and services are run in virtual machines (VM), as over the last decade virtualization has become the de facto standard in the datacenter for the deployment of applications and services. Some applications and services are classified as business critical. These business critical applications require a higher level of resilience and protection to minimize the impact on a business’s operation if they become inoperable.
The ability to quickly recover from an application outage has become imperative in today’s datacenter. There are various methods that offer different levels of protection to maintain application uptime. These methods range from minimizing the downtime at the application level to virtual machine (VM) recovery to physical system recovery. Prior to virtualization, mechanisms were in place to protect physical systems and were based on having secondary hardware and redundant storage systems. However, as noted above, today most systems have been virtualized. The market leader in virtualization, VMware, recognized the importance of availability early on and created business continuity features in vSphere such as vMotion, Storage vMotion, vSphere Replication, vCenter Site Recovery Manager (SRM), vSphere High Availability (HA) and vSphere Fault Tolerance (FT). These features have indeed increased the uptime of applications in the enterprise, yet they are oriented toward protecting the VM. The challenge, as many enterprises have discovered, is that protecting the VM alone does not guarantee uptime for applications and services. Detecting and remediating VM failure falls short of what is truly vital, detecting and remediating application and service failures.
With application and service availability in mind, companies such as Symantec have come in to provide availability and resiliency for them. Focusing on improving how VMware can deliver application availability, Symantec has developed a set of solutions to meet the high availability and disaster recovery requirements of business critical applications. These solutions include Symantec ApplicationHA (developed in partnership with VMware) and Symantec Cluster Server powered by Veritas (VCS). Both of these products have been enhanced to work in a VMware-based virtual infrastructure environment.
New enterprise-grade file systems don’t come around very often. Over the last two decades we have seen very few show up; ZFS was introduced in 2004, Isilon’s OneFS in 2003, Lustre in 2001, and WAFL in 1992. There is a good reason behind this: the creation of a unique enterprise-grade file system is not a trivial undertaking and takes considerable resources and vision. During the last ten years, we have seen seismic changes in the data center and storage industry. Today’s data center runs a far different workload than what was prevalent when these first-generation file systems were developed. For example, today’s data center runs virtual machines, and it is the exception when there is a one-to-one correlation between a server and storage. Databases span the largest single disk drives. A huge amount of data is ingested by big data applications, social media. Data must be kept in order to meet the requirements of government and corporate policy. Technology has dramatically changed over the last decade. For instance, flash memory has become prevalent, commodity x86 processors now rival ASIC chips in power and performance, and new software development and delivery methodologies such as “agile” have become mainstream. In the past, we were concerned with how to deal with the underlying storage, but now we are concerned with how to deal with this huge amount of data that we have stored.
What could be accomplished if a new file system was created from the ground up to take advantage of the latest advances in technology and, more importantly, had an experienced engineering team that had done this once before? That is, in fact, what Qumulo has done with the Qumulo Core data-aware scale-out NAS software, powered by its new file system, QSFS (Qumulo Scalable File System). Qumulo’s three co-founders were the primary inventors of OneFS – Peter Godman, Neal Fachan, Aaron Passey – and they assembled some of the brightest minds in the storage industry to create a new modern file system designed to support the requirements of today’s datacenter, not the datacenter of decades ago.
Qumulo embraced from day one an agile software development and release model to create their product. This allows them to push out fully tested and certified releases every two weeks. Doing this allows for new feature releases and bug fixes that can be seamlessly introduced to the system as soon as they are ready – not based on an arbitrary 6, 12 or even 18-month release schedule.
Flash storage has radically changed the face of the storage industry. All of the major file systems in use today were designed to work with HDD devices that could produce 150 IOPS; if you were willing to sacrifice capacity and short-stroke them you might get twice that. Now flash is prevalent in the industry, and we have commodity flash devices that can produce up to 250,000 IOPs. Traditional file systems were optimized for slower HDD drives – not to take advantage of the lower latency and higher performance of today’s solid state drives. Many traditional file systems and storage arrays have devised ways to “bolt on” SSD devices to their storage to boost their performance. However, their initial architecture was based around the capabilities of yesterday’s HDD drives and not the capabilities of today’s new flash technology.
An explosion in scale-out large capacity file systems has empowered enterprises to do very interesting things, but has also come with some very interesting problems. Even one of the most trivial tasks—determining how much space the files on a file system are consuming—is very complicated to answer on first-generation file systems. Other questions that are difficult to answer without being aware of the data on a file system include finding out who is consuming the most space on a file system, and which clients and files or applications are consuming the most bandwidth. Second-generation file systems need to be designed to be data-aware, not just storage-aware.
In order to reach performance targets, traditional high performance storage arrays were designed to take advantage of an ASIC-optimized architecture. ASIC architecture is good in the sense that it can speed up performance for some storage related operations; however, this benefit comes at a heavy price – both in dollars and in flexibility. It can take years and millions of dollars to embed new features in an ASIC. By using very powerful and relative inexpensive x86 processors new features can be introduced very quickly via software. The slight performance advantage of ASIC-based storage is disappearing fast as x86 processors get more cores (Intel E5-2600 V3 has up to 18 cores) and advanced features.
When Qumulo approached us to take a look at the world’s first data-aware, scale-out enterprise-grade storage system we welcomed the opportunity. Qumulo’s new storage system is not based on an academic project or designed around an existing storage system, but has been designed and built on entirely new code that the principals at Qumulo developed based on what they learned in interviews with more than 600 storage professionals. What they came up with after these conversations was a new data-aware, scale-out NAS file system designed to take advantage of the latest advances in technology. We were interested in finding out how this file system would work in today’s data center.
The IT industry is in the middle of a massive transition toward simplification and efficiency around managing on-premise infrastructure at today’s enterprise data centers. In the past few years there has been a rampant onset of technology clearly focused at simplifying and radically changing the economics of traditional enterprise infrastructure. These technologies include Public/Private Clouds, Converged Infrastructure, and Integrated Systems to name a few. All of these technologies are geared to provide more efficiency of resources, take less time to administer, all at a reduced TCO. However, these technologies all rely on efficiency and simplicity of the underlying technologies of Compute, Network, and Storage. Often times the overall solution is only as good as the weakest link in the chain. The storage tier of the traditional infrastructure stack is often considered the most complex to manage.
This technology validation focuses on measuring the efficiency and management simplicity by comparing two industry leading mid-range external storage arrays configured in the use case of unified storage. Unified storage has been a popular approach to storage subsystems that consolidates both file access and block access within a single external array thus being able to share the same precious drive capacity resources across both protocols simultaneously. Businesses value the ability to send server workloads down a high performance low latency block protocol while still taking advantage of simplicity and ease of sharing file protocols to various clients. In the past businesses would have either setup a separate file server in front of their block array or buy completely separate NAS devices, thus possibly over buying storage resource and adding complexity. Unified storage takes care of this by providing ease of managing one storage device for all business workload needs. In this study we compared the attributes of storage efficiency and ease of managing and monitoring an EMC VNX unified array versus an HP 3PAR StoreServ unified array. The approach we used was to setup two arrays side-by-side and recorded the actual complexity of managing each array for file and block access, per the documents and guides provided for each product. We also went through the exercise of sizing various arrays via publicly available configuration guides to see what the expected storage density efficiency would be for some typically configured systems.
Our conclusion was nothing short of astonishment. In the case of the EMC VNX2 technology, the approach to unification more closely resembles a hardware packaging and management veneer approach than what would have been expected for a second generation unified storage system. HP 3PAR StoreServ on the other hand, in its second generation of unified storage has transitioned the file protocol services from external controllers to completely converged block and file services within the common array controllers. In addition, all the data path and control plumbing is completely internal as well with no need to wire loop back cables between controllers. HP has also made the investment to create a totally new management paradigm based on the HP OneView management architecture, which radically simplifies the administrative approach to managing infrastructure. After performing this technology validation we can state with confidence that HP 3PAR StoreServ 7400c is 2X easier to provision, 2X easier to monitor, and up to 2X more data density efficient than a similarly configured EMC VNX 5600.
Scale Computing was an early proponent of hyperconverged appliances and is one of the innovators in this marketplace. Since the release of Scale Computing’s first hyperconverged appliance, many others have come to embrace the elegance of having storage and compute functionality combined on a single server. Even the virtualization juggernaut VMware has seen the benefits of abstracting, pooling, and running storage and compute on shared commodity hardware. VMware’s current hyperconverged storage initiative, VMware Virtual SAN, seems to be gaining traction in the marketplace. We thought it would be an interesting exercise to compare and contrast Scale Computing’s hyperconverged appliance to a hyperconverged solution built around VMware Virtual SAN. Before we delve into this exercise, however, let’s go over a little background history on the topic.
Taneja Group defines hyperconvergence as the integration of multiple previously separate IT domains into one system in order to serve up an entire IT infrastructure from a single device or system. This means that hyperconverged systems contain all IT infrastructure—networking, compute and storage—while promising to preserve the adaptability of the best traditional IT approaches. Such capability implies an architecture built for seamless and easy scaling over time, in a "grow as needed” fashion.
Scale Computing got its start with scale-out storage appliances and has since morphed these into a hyperconverged appliance—HC3. HC3 was the natural evolution of its well-regarded line of scale-out storage appliances, which includes both a hypervisor and a virtual infrastructure manager. HC3’s strong suit is its ease of use and affordability. The product has seen tremendous growth and now has over 900 deployments.
VMware got its start with compute virtualization software and is by far the largest virtualization company in the world. VMware has always been a software company, and takes pride in its hardware agnosticism. VMware’s first attempt to combine shared direct-attached storage (DAS) storage and compute on the same server resulted in a product called “VMware vSphere Storage Appliance” (VSA), which was released in June of 2011. VSA had many limitations and didn’t seem to gain traction in the marketplace and reached its end of availability (EOA) in June of 2014. VMware’s second attempt, VMware Virtual SAN (VSAN), which was announced at VMworld in 2013, shows a lot of promise and seems to be gaining acceptance, with over 300 paying customers using the product. We will be comparing VMware Virtual SAN to Scale Computing’s hyperconverged appliance, HC3, in this paper.
Here we have two companies: Scale Computing, which has transformed from an early innovator in scale-out storage to a company that provides a hyperconverged appliance; and VMware, which was an early innovator in compute virtualization and since has transformed into a company that provides the software needed to create build-your-own hyperconverged systems. We looked deeply into both systems (HC3 and VSAN) and walked both through a series of exercises to see how they compare. We aimed this review at what we consider a sweet spot for these products: small to medium-sized enterprises with limited dedicated IT staff and a limited budget. After spending time with these two solutions, and probing various facets of them, we came up with some strong conclusions about their ability to provide an affordable, easy to use, scalable solution for this market.
The observations we have made for both products are based on hands-on testing both in our lab and on-site at Scale Computing’s facility in Indianapolis, Indiana. Although we talk about performance in general terms, we do not, and you should not, construe this to be a benchmarking test. We have, in good faith, verified all conclusions made around any timing issues. Moreover, the numbers that we are using are generalities that we believe are widely known and accepted in the virtualization community.
Consolidation and enhanced management enabled by virtualization has revolutionized the practice of IT around the world over the past few years. By abstracting compute from the underlying hardware systems, and enabling oversubscription of physical systems by virtual workloads, IT has been able to pack more systems into the data center than before. Moreover, for the first time in seemingly decades, IT has also taken a serious leap ahead in management, as this same virtual infrastructure has wrapped the virtualized workload with better capabilities than ever before - tools like increased visibility, fast provisioning, enhanced cloning, and better data protection. The net result has been a serious increase in overall IT efficiency.
But not all is love and roses with the virtual infrastructure. In the face of serious benefits and consequent rampant adoption, virtualization continues to advance and bring about more capability. All too often, an increase in capability has come at the cost of complexity. Virtualization now promises to do everything from serving up compute instances, to providing network infrastructure and network security, to enabling private clouds.
For certain, much of this complexity exists between the individual physical infrastructures that IT must touch, and the simultaneous duplication that virtualization often brings into the picture. Virtual and physical networks must now be integrated, the relationship between virtual and physical servers must be tracked, and the administrator can barely answer with certainty whether key storage functions, like snapshots, should be managed on physical storage systems or in the virtual infrastructure.
With challenges surrounding the complexity in managing a virtualized datacenter, Scale Computing, long a provider of scale-out storage, introduced a new line of hyperconverged appliances - HC3 in April, 2012 and updated the appliances with the new HyperCore software in May, 2014. HC3 is an integration of storage and virtualized compute within a scale-out building block architecture that couples all of the elements of a virtual data center together inside a hyperconverged appliance. The result is a system that is simple to use and does away with much of the complexity associated with virtualization in the data center. By virtualizing and intermingling compute and storage inside a system that is designed for scale-out, HC3 does away with the need to manage virtual networks, assemble complex compute clusters, provision and manage storage, and a bevy of other day to day administrative tasks. Provisioning additional resources - any resource - becomes one-click-easy, and adding more physical resources as the business grows is reduced to a simple 2-minute exercise.
While this sounds compelling on the surface, Taneja Group recently turned our Technology Validation service - our hands-on lab service - to the task of evaluating whether Scale Computing's HC3 could deliver on these promises in the real world. For this task, we put an HC3 cluster through the paces to see how well it deployed, how it held up under use, and what special features it delivered that might go beyond the features found in traditional integrations of discreet compute and storage systems.
Storage performance has long been the bane of the enterprise infrastructure. Fortunately, in the past couple of years, solid-state technologies have allowed new comers as well as established storage vendors to start shaping up clever, cost effective, and highly efficient storage solutions that unlock greater storage performance. It is our opinion that the most innovative of these solutions are the ones that require no real alteration in the storage infrastructure, nor a change in data management and protection practices.
This is entirely possible with server-side caching solutions today. Server-side caching solutions typically use either PCIe solid-state NAND Flash or SAS/SATA SSDs installed in the server alongside a hardware or software IO handler component that mirrors commonly utilized data blocks onto the local high speed solid-state storage. Then the IO handler redirects server requests for data blocks to those local copies that are served up with lower latency (microseconds instead of milliseconds) and greater bandwidth than the original backend storage. Since data is simply cached, instead of moved, the solution is transparent to the infrastructure. Data remains consolidated on the same enterprise infrastructure, and all of the original data management practices – such as snapshots and backup – still work. Moreover, server-side caches can actually offload IO from the backend storage system, and can allow a single storage system to effectively serve many more clients. Clearly there’s tremendous potential value in a solution that can be transparently inserted into the infrastructure and address storage performance problems.