Storing digital data has long been a perilous task. Not only are stored digital bits subject to the catastrophic failure of the devices they rest upon, but the nature of shared digital bits subjects them to error and even intentional destruction. In the virtual infrastructure, the dangers and challenges subtly shift. Data is more highly consolidated, and more systems depend wholly on shared data repositories; this increases data risk. With many virtual machines connecting to a single shared storage pool, IO and storage performance have become incredibly precious resources; this complicates backup, and means that backup IO can cripple a busy infrastructure. Backup is more important than ever before, but it is also fundamentally more challenging than ever before.
Fortunately, the industry learned this lesson quickly in the early days of virtualization, and has aggressively innovated to bring tools and technologies to bear on the challenge of backup and recovery for virtualized environments. APIs have unlocked more direct access to data, and products have finally come to market that make protection easier to use and more compatible with the dynamic, mobile workloads of the virtual data center. Nonetheless, differences abound between product offerings, often rooted in the subtleties of architecture – architectures that ultimately determine whether a backup product is best suited to SMB-sized needs, or whether a solution can scale to support the large enterprise.
Moreover, within the virtual data center, TCO centers on resource efficiency, and a backup strategy can be one of the most significant determinants of that efficiency. On one hand, traditional backup simply does not work, and can cripple efficiency: there is too much IO contention and application complexity in converting a legacy physical infrastructure backup approach to the virtual infrastructure. On the other hand, a number of specialized point solutions are designed to tackle some of the challenges of virtual infrastructure backup. But too often, these products do not scale sufficiently, lack consolidated management, and stand to impose tremendous operational overhead as the customer's environment and data grow. When taking a strategic look at the options, backup approaches often appear to fly directly in the face of resource efficiency.
Virtual Storage Appliances (VSAs) have been around for a while – just over 5 years ago, the earliest vendors started to sample market interest in this technology. In theory, the market was interested, but perhaps more so on paper than in actual adoption during those early days. Regardless, that interest drove more vendors to release VSAs and today there are dozens of Virtual Storage Appliances on the market. Many of these are focused on capabilities such as backup, but at least a handful can serve as primary storage beneath the virtual infrastructure.
The primary storage VSAs on the market came about as product or marketing experiments: perhaps to let customers experience a storage system without making a full investment, to allow customers to ingest rogue virtual infrastructure storage back into their existing storage infrastructure, or to enable consistent storage management as customers deployed workloads with remote service providers.
For certain, many of these primary storage VSAs have never found their footing, and still languish as a neglected technology in a dusty corner of a vendor's product portfolio. But there have been exceptions. One is HP StoreVirtual. HP has been quite serious about delivering StoreVirtual as a real storage solution with hefty capabilities. StoreVirtual is one of HP's several converged storage technologies that are blurring the boundaries between storage and compute, and helping customer infrastructures scale and adapt while maintaining maximum efficiency. The popular StoreVirtual product line comes in a variety of physical formats, from entry-level 1U, 4-drive systems to extremely dense BladeSystem SANs. Approximately 5 years ago, the StoreVirtual software foundation was also released in Virtual Storage Appliance form. This StoreVirtual VSA is a full storage system that looks, acts, and functions just like its physical StoreVirtual brethren. The intent behind HP's StoreVirtual VSA is increased ease of use, increased storage functionality in the virtual infrastructure, and greater adaptability, within a dense footprint that can make use of any available storage resources (direct-attached server storage or networked storage). HP claims that StoreVirtual VSA leads the market in ease of use, performance, efficiency, and storage capabilities – all of which makes it ideally positioned to service primary workloads in the data center.
In this Technology Validation, we set out to examine StoreVirtual VSA, and through comparison to another leading virtual storage appliance (VMware’s vSphere Storage Appliance – VMware VSA) evaluate the effectiveness of StoreVirtual VSA’s architecture in enabling superior, primary-workload-ready storage in the virtual infrastructure. With an eye on ease of use, efficiency, and flexibility, we put StoreVirtual VSA and VMware vSphere Storage Appliance through a detailed examination that included both a review of functionality and a hands-on lab examination of performance, scalability, resiliency, and ease of use.
Just a couple of years ago, as solid-state storage technologies began finding significant mainstream adoption, Taneja Group began closely following a vendor whose architectural roadmap seemed to destine them to be the pre-eminent architectural leader for scale-out, high performance, enterprise-ready, cost-effective solid-state arrays.
That vendor was Kaminario, who first entered the market with a highly resilient, scale-out architecture that promised extreme performance with more linear scalability, as well as superior availability and serviceability versus other offerings we then saw on the market.
In the past couple of years, Kaminario has continued advancing their technology in both performance and features, systematically adding the mainstream features that the enterprise demands – and that are too often missing on high performance storage systems: features like snapshots, utilization reporting, resiliency that tolerates full node failures, and more.
In turn, Kaminario recently drew the attention of Taneja Group Labs. Scale-out and enterprise-class storage management features are not easy to architect (especially not together), and we wanted to know whether Kaminario could deliver enterprise-class wrappings with all of their historic scale-out capabilities.
Deduplication took the market by storm several years ago, and backup hasn't been the same since. With the ability to eradicate duplicate data in duplication-prone backups, deduplication made it practical to store large amounts of backup data on disk instead of tape. In short order, a number of vendors marched into the market spotlight offering products with tremendous efficiency claims, great throughput rates, and greater tolerance for the too often erratic throughput of backup jobs that was a thorn in the side for traditional tape. Today, deduplicating backup storage appliances are a common sight in data centers of all types and sizes.
But deduplicating data is a tricky science. It is often not as simple as just finding matching runs of similar data. Backup applications and modifications to data can sprinkle data streams with mismatched bits and pieces, making deduplication much more challenging. The problem is worst for Virtual Tape Libraries (VTLs) that emulate traditional tape. Since they emulate tape, backup applications use all of their traditional tape formatting. Such formatting is designed to compensate for tape shortcomings and allow faster and better application access to data on tape, but it creates noise for deduplication.
The best products on the market recognize this challenge and have built “parsers” for every backup application – technology that recognizes the metadata within the backup stream and enables the backup storage appliance to read around it.
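The effect a parser has can be sketched with a toy deduplicator. The snippet below is purely illustrative, not any vendor's implementation: it uses simple fixed-size chunk fingerprinting (real appliances typically use more sophisticated, variable-size chunking), and the tape header string is a hypothetical stand-in for backup-application metadata. The point it demonstrates is that a small piece of interleaved metadata shifts every chunk boundary after it and destroys duplicate matches, while a parser that reads around the metadata restores them.

```python
import hashlib
import os

def chunk_hashes(data: bytes, size: int = 64) -> list:
    """Split a stream into fixed-size chunks and fingerprint each one."""
    return [hashlib.sha256(data[i:i + size]).hexdigest()
            for i in range(0, len(data), size)]

def dedup_ratio(hashes: list) -> float:
    """Logical chunks seen, divided by unique chunks actually stored."""
    return len(hashes) / len(set(hashes))

record = os.urandom(512)        # one 512-byte "file" in the backup
header = b"TAPE-HDR-0001"       # hypothetical per-job tape metadata

# Two backups of the same file, back to back: perfect duplicates.
clean_stream = record + record

# Same data, but the backup application inserts its tape header before
# the second copy, shifting every chunk boundary after it.
noisy_stream = record + header + record

# A "parser" recognizes the application's metadata and reads around it.
parsed_stream = noisy_stream.replace(header, b"")

print(dedup_ratio(chunk_hashes(clean_stream)))   # 2.0 - every chunk repeats
print(dedup_ratio(chunk_hashes(noisy_stream)))   # ~1.0 - boundaries shifted
print(dedup_ratio(chunk_hashes(parsed_stream)))  # 2.0 - duplicates restored
```

In this sketch, thirteen bytes of metadata are enough to cut the deduplication ratio of an otherwise fully duplicate stream roughly in half, which is why format-aware parsing matters so much for tape-emulating targets.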
In 2012, IBM introduced a parser for IBM’s leading backup application Tivoli Storage Manager (TSM) in their ProtecTIER line of backup storage solutions. TSM has long had a reputation for a noisy tape format. That format enables richer data interaction than many competitors, but it creates enormous challenges for deduplication.
At IBM's invitation, in November of 2012, Taneja Group Labs put ProtecTIER through its paces to evaluate whether this parser for the ProtecTIER family makes a difference. Our findings: clearly it does. In our highly structured lab exercise, ProtecTIER looked fully poised to deliver advertised deduplication for TSM environments; in our case, we observed a reasonable 10X to 20X deduplication range for real-world Microsoft Exchange data.
Over the past few years, backup has become a busy market. For the first time in many years, a new wave of energy hit this market as small innovators sprang forth to tackle pressing challenges around virtual server backup. The market has taken off because of a unique set of challenges and simultaneous opportunities within the virtual infrastructure – with large amounts of highly similar data, interesting APIs for automation, and a tightly constrained set of IO and processing resources, the data behind the virtual server can be captured and protected in entirely new ways. As innovators in turn attacked these opportunities, backup has been fundamentally changed. In many cases, backup has been put in the hands of the virtual infrastructure administrator, made lighter weight and vastly more accessible, and has become a powerful tool for data protection and data management.
In reality, the innovations with virtual backup have leveraged the unifying layer of virtualization to tackle several key backup challenges. These challenges have been long-standing in the practice of data protection, and include ever-tightening backup windows, ever more demanding recovery point objectives (RPO, the amount of tolerable data loss when recovering), short recovery time objectives (RTO, how long it takes to complete a recovery), recovery reliability, and complexity. Specialized data protection for the virtual infrastructure has made enormous progress in tackling these challenges, and simplifying the practice of data protection to boot.
But we've often wondered what it would take to bring the innovation from virtual infrastructure protection to a full-fledged backup product that could tackle both physical and virtual systems. At the recent request of Dell, Taneja Group Labs had the opportunity to look at just such a product. That product is AppAssure – a set of technology that seems destined to be the future architectural anchor for the many data protection technologies in Dell's rapidly growing product portfolio. We jumped at the chance to run AppAssure through its paces in a hands-on exercise, as we wanted to see whether AppAssure had an architecture that might be poised to change how datacenter-wide protection is typically done, perhaps by making it more agile and accessible.
The past few years have seen virtualization rapidly move into the mainstream of the data center. Today, virtualization is often the de facto standard in the data center for deployment of any application or service. This includes important operational and business systems that are the lifeblood of the business.
For mission critical systems, customers necessarily demand a broader level of services than is common among the test and development environments where virtualization often gains its foothold in the data center. It goes almost without saying that topmost in customers' minds are issues of availability.
Availability is a spectrum of technology that offers businesses many different levels of protection – from general recoverability to uninterruptable applications. At the most fundamental level are mechanisms that protect the data and the server beneath applications. While in the past these mechanisms have often been hardware and secondary storage systems, VMware has steadily advanced the capabilities of their vSphere virtualization offering, and it includes a long list of features – vMotion, Storage vMotion, vSphere Replication, VMware vCenter Site Recovery Manager, vSphere High Availability, and vSphere Fault Tolerance. While clearly VMware is serious about the mission critical enterprise, each of these offerings has retained a VMware-specific orientation toward protecting the "compute instance".
The challenge is that protecting a compute instance does not go far enough. It is the application that matters, and detecting VM failures may fall short of detecting and mitigating application failures.
With this in mind, Symantec has steadily advanced a range of solutions for enhancing availability protection in the virtual infrastructure. Today this includes ApplicationHA – developed in partnership with VMware – and their gold standard offering of Veritas Cluster Server (VCS) enhanced for the virtual infrastructure. We recently turned an eye toward how these solutions enhance virtual availability in a hands-on lab exercise, conducted remotely from Taneja Group Labs in Phoenix, AZ. Our conclusion: VCS is the only HA/DR solution that can monitor and recover applications on VMware while remaining fully compatible with typical vSphere management practices such as vMotion, Distributed Resource Scheduler, and Site Recovery Manager, and it can make a serious difference in the availability of important applications.