Includes Storage Arrays, NAS, File Systems, Clustered and Distributed File Systems, FC Switches/Directors, HBA, CNA, Routers, Components, Semiconductors, Server Blades.
Taneja Group analysts cover all form and manner of storage arrays, modular and monolithic, enterprise or SMB, large and small, general purpose or specialized. All components that make up the SAN, FC-based or iSCSI-based, and all forms of file servers, including NAS systems based on clustered or distributed file systems, are covered soup to nuts. Our analysts have particularly deep backgrounds in the file systems area. Components such as Storage Network Processors, SAS Expanders, and FC Controllers are covered here as well. Server Blades coverage spans both this section and the Infrastructure Management section above.
Let's face it: today’s storage is dumb. Mostly it is a dumping ground for data. As we produce more data, we simply buy more storage and fill it up. We don't know who is using what storage at a given point in time, which applications are hogging storage or have gone rogue, or what and how much sensitive information is stored, moved or accessed, and by whom. Basically, we are blind to whatever is happening inside that storage array. It should be the other way around: storage should just work, users should see it as an endless, invisible resource, and administrators should be able to unlock the value of the data itself through real-time analytical insight rather than fighting fires just to keep storage running and provisioned.
Storage systems these days are often quoted in petabytes and will eventually move to exabytes and beyond. Businesses are being crushed under the weight of this data sprawl, and a new tsunami of data is coming their way as the Internet of Things fully comes online in the next decade. How are administrators dealing with this ever-increasing appetite for data? It is time for a radical new approach to building a storage system, one that is aware of the information stored within it while dramatically reducing the time administrators spend managing the system.
Welcome to the new era of data-aware storage. It could not have come at a better time. Storage growth, as we all know, is out of control. Granted, the cost per GB keeps falling at roughly 40% per year, but capacity keeps growing at roughly 60% per year, so total spend stays stubbornly high while the amount of capacity under management keeps climbing. Cost is certainly an issue, but the bigger issues are manageability and not knowing what lies buried in those mounds of data. Instead of being an asset, data becomes a dead weight that keeps getting heavier. If we don’t do something about it, we will simply be overwhelmed, if we are not already.
The question we ask is: why is it possible to build data-aware storage today when it wasn’t practical yesterday? The answer is simple: flash technology, virtualization, and the availability of “free” CPU cycles make it possible to build storage that can do a lot of heavy lifting from the inside. It could have been attempted in the past, but doing so would have slowed the performance of primary storage to the point of uselessness. So, in the past, we simply let storage store data. Today, we can build in a lot of intelligence without impacting performance or quality of service. We call this new type of storage data-aware storage.
When implemented correctly, data-aware storage can provide insights that were not possible yesterday. It would reduce the risk of non-compliance. It would improve governance. It would automate many of the storage management processes that are manual today. It would provide insights into how well the storage is being utilized. It would identify when a dangerous situation is about to occur, whether related to compliance, capacity, performance or SLAs. You get the point: storage that is inherently smart and knows what type of data it has, how it is growing, who is using it, who is abusing it, and so on.
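To make the idea concrete, here is a toy sketch, in Python, of the kinds of questions such storage should be able to answer on demand. The record fields and query helpers are our own illustrative assumptions, not any vendor’s actual interface.

```python
# Illustrative sketch only: a toy model of the questions data-aware storage
# should answer in real time. Fields and functions are hypothetical.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class FileRecord:
    path: str
    owner: str
    size_bytes: int
    data_type: str        # e.g. "video", "database", "log"
    contains_pii: bool    # flagged by content inspection

def capacity_by_owner(records):
    """Who is using (or abusing) the most storage?"""
    totals = defaultdict(int)
    for r in records:
        totals[r.owner] += r.size_bytes
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)

def sensitive_locations(records):
    """Where does sensitive information live?"""
    return [r.path for r in records if r.contains_pii]

records = [
    FileRecord("/proj/a/scan.mp4", "alice", 4_000_000_000, "video", False),
    FileRecord("/hr/salaries.db", "bob", 250_000_000, "database", True),
]
print(capacity_by_owner(records))   # [('alice', 4000000000), ('bob', 250000000)]
print(sensitive_locations(records)) # ['/hr/salaries.db']
```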
In this profile, we dive deep into a new technology called Qumulo Core, the industry’s first data-aware scale-out NAS platform. Qumulo Core promises to radically change the scale-out NAS product category by using built-in data awareness to massively scale a distributed file system, while at the same time radically reducing the time needed to administer a system that can hold billions of files. File systems in the past could not scale to this level because the administrative tools would buckle under the weight of the system.
Converged infrastructure systems – the integration of compute, networking, and storage – have rapidly become the preferred foundational building block adopted by businesses of all shapes and sizes. The success of these systems has been driven by an insatiable desire to make IT simpler, faster, and more efficient. IT can no longer afford the effort and time to custom-build its infrastructure from best-of-breed DIY components. Purpose-built converged infrastructure systems have been optimized for the most common IT workloads, such as Private Cloud, Big Data, Virtualization, Database and Desktop Virtualization (VDI).
Traditionally, these converged infrastructure systems have been built using a three-tier architecture in which compute, networking, and storage, integrated at the rack level, give businesses the flexibility to cover the widest range of solution workload requirements while still using well-known infrastructure components. Recently, a more modular approach to convergence has emerged, using what we term Hyper-Convergence. With hyper-convergence, the three-tier architecture is collapsed into a single system appliance that is purpose-built for virtualization, with hypervisor, compute, and storage with advanced data services all integrated into an industry-standard x86 building block.
In this paper we will examine the ideal solution environments where Hyper-Converged products have flourished. We will then give practical guidance on solution positioning for HP’s latest ConvergedSystem Hyper-Converged product offerings.
All-flash arrays are changing the datacenter for the better. No longer do we worry about IOPS bottlenecks at the array: all-flash arrays (AFAs) can deliver a staggering number of IOPS, and AFAs able to deliver hundreds of thousands of IOPS are not uncommon. The problem now, however, is how to get those IOPS from the array to the servers. We recently had a chance to see how well an AFA using the EMC PowerPath driver works to eliminate this bottleneck—and we were blown away. Most performance comparisons of datacenter infrastructure show a 10-30% improvement, but the improvement that we saw with PowerPath was extraordinary.
Getting bits from an array to a server is easy—very easy, in fact. The trick is getting the bits from a server to an array in an efficient manner when you have many virtual machines (VMs) on multiple physical hosts transmitting those bits over a physical network with a virtual fabric overlay; this is much more difficult. Errors can be introduced and must be dealt with; the most efficient path must be obtained and established, then re-evaluated and re-established continually; and any misconfiguration can produce less-than-optimal performance, in some cases even outages or data loss. To deal with the “pathing,” or how the I/O travels from the VM to storage, the OS running on the host needs a driver; where multiple paths can be taken from the server to the array, a multipathing driver is needed to direct the traffic.
Windows, Linux, VMware and most other modern operating systems include a basic multipath driver; however, these drivers tend to be generic, are not optimized to extract the maximum performance from a particular array, and come with only rudimentary traffic optimization and management functions. In some cases these generic drivers are fine, but in the majority of datacenters the infrastructure is overtaxed and its equipment needs to be used in the most efficient manner possible. Fortunately, storage companies such as EMC are committed to making their arrays perform as well as possible and invest considerable time and research in developing multipathing drivers optimized for their arrays. EMC invited us to take a look at how PowerPath, their optimized “intelligent” multipath driver, performed on an XtremIO flash array connected to a Dell PowerEdge R710 server running ESXi 6.0 while simulating an Oracle workload. We looked at the results of the various tests EMC ran comparing the PowerPath/VE multipath driver against VMware’s ESXi Native Multipath driver, and we were impressed—very impressed—by the difference that an optimized multipath driver like PowerPath can make in a high-I/O traffic scenario.
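As a rough illustration of the job any multipathing driver performs (tracking the health of each path and steering I/O across the paths that remain alive), consider the simplified sketch below. The path names and selection logic are illustrative only; a real driver such as PowerPath layers array-specific intelligence, load balancing and error handling on top of this basic idea.

```python
# Minimal sketch of multipath path selection: round-robin across healthy
# paths, with failover when a path reports an error. Real drivers (native
# multipathing, PowerPath) are far more sophisticated; names are illustrative.
import itertools

class MultipathDevice:
    def __init__(self, paths):
        self.paths = {p: True for p in paths}   # path -> healthy?
        self._rr = itertools.cycle(paths)

    def mark_failed(self, path):
        self.paths[path] = False                # e.g. after an I/O timeout

    def next_path(self):
        for _ in range(len(self.paths)):
            p = next(self._rr)
            if self.paths[p]:
                return p
        raise IOError("all paths down")

dev = MultipathDevice(["vmhba1:C0:T0:L1", "vmhba2:C0:T0:L1"])
print(dev.next_path())                  # vmhba1:C0:T0:L1
dev.mark_failed("vmhba1:C0:T0:L1")
print(dev.next_path())                  # fails over to vmhba2:C0:T0:L1
```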
Although server virtualization provides enormous benefits for the modern data center, it can also be daunting from a storage perspective. Provisioning storage to match the exact virtual machine (VM) requirements has always been challenging, often ending in a series of compromises. Each VM has its own unique performance and storage requirements, which can lead to over-provisioning and other inefficient uses of storage. A virtualization administrator has to try to match a VM’s storage requirements as closely as possible to storage that has been pre-provisioned. Provisioning in this way is time consuming and lacks the VM-level granularity required to meet the specific needs of the applications running in a VM. To exacerbate the problem, a VM’s storage requirements often change over its lifetime, which requires ongoing diligence, review of the storage platform, and manual intervention to meet the new requirements.
VMware vSphere 6.0 introduced the biggest change to the ESXi storage stack since the company’s inception with the inclusion of vSphere Virtual Volumes (VVol). VVol helps solve the challenge of matching a VM’s storage requirements with external storage capabilities on a per-VM basis. We found that VVol, when combined with Dell EqualLogic PS Series arrays, becomes a powerful force in the datacenter.
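The essence of that per-VM matching can be sketched in a few lines. The capability names and matching rule below are purely illustrative assumptions, not VMware’s or Dell’s actual policy engine.

```python
# Illustrative sketch of per-VM policy matching, the core idea behind VVol
# and storage policy-based management. All names and logic are assumptions.
vm_policy = {"min_iops": 5000, "snapshots": True, "encryption": False}

array_capabilities = {
    "eql-pool-ssd":    {"min_iops": 20000, "snapshots": True,  "encryption": True},
    "eql-pool-hybrid": {"min_iops": 3000,  "snapshots": True,  "encryption": False},
}

def placements(policy, capabilities):
    """Return the pools that can satisfy every requirement in the VM's policy."""
    matches = []
    for pool, caps in capabilities.items():
        iops_ok = caps["min_iops"] >= policy["min_iops"]
        features_ok = all(caps[k] or not wanted
                          for k, wanted in policy.items() if k != "min_iops")
        if iops_ok and features_ok:
            matches.append(pool)
    return matches

print(placements(vm_policy, array_capabilities))  # ['eql-pool-ssd']
```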
In this brief, we call out some storage-related highlights of vSphere 6.0, such as Virtual Volumes, and then take a close look at how they, as well as VMs stored on traditional datastores, have been enhanced and packaged by Dell Storage into the EqualLogic PS Series storage solution. We will show how Dell and VMware have combined forces to deliver an enterprise-class virtual server and storage environment that is highly optimized and directly addresses the performance, availability, data protection and complexity challenges common in today’s business-critical virtualized data centers.
All businesses have a core set of applications and services that are critical to their ongoing operation and growth. They are the lifeblood of a business. Many of these applications and services run in virtual machines (VMs), as over the last decade virtualization has become the de facto standard in the datacenter for the deployment of applications and services. Some applications and services are classified as business critical, and these business-critical applications require a higher level of resilience and protection to minimize the impact on a business’s operation if they become inoperable.
The ability to quickly recover from an application outage has become imperative in today’s datacenter. There are various methods that offer different levels of protection to maintain application uptime. These methods range from minimizing downtime at the application level to virtual machine (VM) recovery to physical system recovery. Prior to virtualization, mechanisms to protect physical systems were in place, based on secondary hardware and redundant storage systems. However, as noted above, today most systems have been virtualized. The market leader in virtualization, VMware, recognized the importance of availability early on and created business continuity features in vSphere such as vMotion, Storage vMotion, vSphere Replication, vCenter Site Recovery Manager (SRM), vSphere High Availability (HA) and vSphere Fault Tolerance (FT). These features have indeed increased the uptime of applications in the enterprise, yet they are oriented toward protecting the VM. The challenge, as many enterprises have discovered, is that protecting the VM alone does not guarantee uptime for applications and services. Detecting and remediating VM failure falls short of what is truly vital: detecting and remediating application and service failures.
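The distinction can be illustrated with a minimal sketch: a VM can report a healthy heartbeat while the service inside it is down, so remediation has to look one level deeper. The function names and recovery actions below are illustrative assumptions, not the actual behavior of any specific product.

```python
# Simplified sketch of why application-level monitoring matters: a VM can be
# "up" while the service inside it is down. Names are illustrative only.
def vm_heartbeat_ok(vm):
    return vm["power_state"] == "on"            # what VM-level HA can see

def app_healthy(vm):
    return vm["service_responding"]             # e.g. a probe of the app's port

def remediate(vm):
    if not vm_heartbeat_ok(vm):
        return "restart VM on another host"     # classic VM-level HA response
    if not app_healthy(vm):
        return "restart application inside VM"  # what application-level HA adds
    return "no action"

vm = {"power_state": "on", "service_responding": False}
print(remediate(vm))   # 'restart application inside VM'
```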
With application and service availability in mind, companies such as Symantec have stepped in to provide availability and resiliency at the application layer. Focusing on improving how VMware delivers application availability, Symantec has developed a set of solutions to meet the high availability and disaster recovery requirements of business-critical applications. These solutions include Symantec ApplicationHA (developed in partnership with VMware) and Symantec Cluster Server powered by Veritas (VCS). Both of these products have been enhanced to work in a VMware-based virtual infrastructure environment.
New enterprise-grade file systems don’t come around very often. Over the past two decades or so we have seen very few show up: ZFS was introduced in 2004, Isilon’s OneFS in 2003, Lustre in 2001, and WAFL in 1992. There is a good reason for this: creating a unique enterprise-grade file system is not a trivial undertaking and takes considerable resources and vision. During the last ten years, we have seen seismic changes in the data center and storage industry. Today’s data center runs a far different workload than what was prevalent when these first-generation file systems were developed. For example, today’s data center runs virtual machines, and it is the exception when there is a one-to-one relationship between a server and its storage. Databases have outgrown even the largest single disk drives. A huge amount of data is ingested by big data applications and social media. Data must be retained to meet the requirements of government regulation and corporate policy. Technology has also changed dramatically over the last decade: flash memory has become prevalent, commodity x86 processors now rival ASICs in power and performance, and new software development and delivery methodologies such as “agile” have become mainstream. In the past, we were concerned with how to manage the underlying storage; now we are concerned with how to manage the huge amount of data we have stored.
What could be accomplished if a new file system was created from the ground up to take advantage of the latest advances in technology and, more importantly, had an experienced engineering team that had done this once before? That is, in fact, what Qumulo has done with the Qumulo Core data-aware scale-out NAS software, powered by its new file system, QSFS (Qumulo Scalable File System). Qumulo’s three co-founders – Peter Godman, Neal Fachan and Aaron Passey – were the primary inventors of OneFS, and they assembled some of the brightest minds in the storage industry to create a new, modern file system designed to support the requirements of today’s datacenter, not the datacenter of decades ago.
From day one, Qumulo embraced an agile software development and release model, allowing it to push out fully tested and certified releases every two weeks. New features and bug fixes can be seamlessly introduced to the system as soon as they are ready – not on an arbitrary 6-, 12- or even 18-month release schedule.
Flash storage has radically changed the face of the storage industry. All of the major file systems in use today were designed to work with HDDs that could produce roughly 150 IOPS; if you were willing to sacrifice capacity and short-stroke them, you might get twice that. Now flash is prevalent in the industry, and commodity flash devices can produce up to 250,000 IOPS. Traditional file systems were optimized for slower HDDs, not to take advantage of the lower latency and higher performance of today’s solid state drives. Many traditional file systems and storage arrays have devised ways to “bolt on” SSDs to boost performance; however, their underlying architecture was built around the capabilities of yesterday’s HDDs, not the capabilities of today’s flash technology.
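The gap is easy to quantify with the figures above:

```python
# Back-of-envelope arithmetic: how many 150-IOPS disks would it take to
# match one 250,000-IOPS flash device?
hdd_iops = 150
ssd_iops = 250_000
print(ssd_iops / hdd_iops)   # ~1667 spindles to equal a single SSD
```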
An explosion in scale-out, large-capacity file systems has empowered enterprises to do very interesting things, but it has also brought some very interesting problems. Even one of the most trivial questions—how much space are the files on a file system consuming?—is very complicated to answer on first-generation file systems. Other questions that are difficult to answer without awareness of the data on a file system include who is consuming the most space, and which clients, files or applications are consuming the most bandwidth. Second-generation file systems need to be designed to be data-aware, not just storage-aware.
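The contrast can be shown with a toy example, purely a sketch of the general idea of pre-maintained aggregates rather than a description of any particular product’s implementation:

```python
# Toy illustration: a first-generation file system must walk every file to
# answer "how much space is used here?"; a data-aware design keeps a running
# total per directory, updated as files change. Structures are illustrative.
class Dir:
    def __init__(self, name, parent=None):
        self.name, self.parent = name, parent
        self.file_bytes = 0          # bytes of files directly in this dir
        self.subtree_bytes = 0       # pre-maintained aggregate for the subtree

    def add_file(self, size):
        self.file_bytes += size
        node = self
        while node is not None:      # propagate the delta up the tree
            node.subtree_bytes += size
            node = node.parent

root = Dir("/")
projects = Dir("projects", parent=root)
projects.add_file(10_000_000)
projects.add_file(5_000_000)
print(root.subtree_bytes)    # 15000000, answered without walking the tree
```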
In order to reach performance targets, traditional high-performance storage arrays were designed around ASIC-optimized architectures. ASICs are good in the sense that they can speed up certain storage-related operations; however, this benefit comes at a heavy price – both in dollars and in flexibility. It can take years and millions of dollars to embed new features in an ASIC. By using very powerful and relatively inexpensive x86 processors, new features can be introduced very quickly via software. The slight performance advantage of ASIC-based storage is disappearing fast as x86 processors gain more cores (the Intel Xeon E5-2600 v3 family has up to 18 cores) and advanced features.
When Qumulo approached us to take a look at the world’s first data-aware, scale-out, enterprise-grade storage system, we welcomed the opportunity. Qumulo’s new storage system is not based on an academic project or designed around an existing storage system; it is built on entirely new code that the principals at Qumulo developed based on what they learned in interviews with more than 600 storage professionals. What they came up with after these conversations was a new data-aware, scale-out NAS file system designed to take advantage of the latest advances in technology. We were interested in finding out how this file system would work in today’s data center.