Primary storage is often defined as storage hosting mission-critical applications with tight SLAs, requiring high performance. Secondary storage is where everything else typically ends up and, unfortunately, data stored there tends to accumulate without much oversight. Most of the improvements within the overall storage space, most recently driven by the move to hyperconverged infrastructure, have flowed into primary storage. By shifting the focus from individual hardware components to commoditized, clustered and virtualized storage, hyperconvergence has provided a highly available virtual platform to run applications on. This has allowed IT to move from managing individual hardware components to running business applications, increasing productivity and reducing costs.
Companies adopting this new class of products certainly enjoyed the benefits, but were still nagged by a set of problems that it didn’t address in a complete fashion. On the secondary storage side of things, they were still left dealing with too many separate use cases, each with its own point solution. This led to too many products to manage, too much duplication and too much waste. In truth, many hyperconvergence vendors have done a reasonable job at addressing primary storage use cases on their platforms, but there’s still more to be done there and more secondary storage use cases to address.
Now, however, a new category of storage has emerged. Hyperconverged Secondary Storage brings the same sort of distributed, scale-out file system to secondary storage that hyperconvergence brought to primary storage. But, given the disparate use cases that are embedded in secondary storage and the massive amount of data that resides there, it’s an equally big problem to solve, and the solution has to go further than just abstracting and scaling the underlying physical storage devices. True Hyperconverged Secondary Storage also integrates the key secondary storage workflows - Data Protection, DR, Analytics and Test/Dev - and provides global deduplication for overall file storage efficiency, file indexing and searching services for more efficient storage management, and hooks into the cloud for efficient archiving.
Cohesity has taken this challenge head-on.
Before delving into the Cohesity Data Platform, the subject of this profile and one of the pioneering offerings in this new category, we’ll take a quick look at the state of secondary storage today and note how current products haven’t completely addressed these existing secondary storage problems, creating an opening for new competitors to step in.
There are a lot of game-changing trends in IT today including mobility, cloud, and big data analytics. As a result, IT architectures, data centers, and data processing are all becoming more complex – increasingly dynamic, heterogeneous, and distributed. For all IT organizations, achieving great success today depends on staying in control of rapidly growing and faster flowing data.
While there are many ways for IT technology and solution providers to help clients depending on their maturity, size, industry, and key business applications, every IT organization has to wrestle with BURA (Backup, Recovery, and Archiving). Protecting and preserving the value of data is a key business requirement even as the types, amounts, and uses of that data evolve and grow.
For IT organizations, BURA is an ever-present, huge, and growing challenge. Unfortunately, implementing a thorough and competent BURA solution often requires piecing and patching together multiple vendor products and solutions. These never quite fully address the many disparate needs of most organizations nor manage to be very simple or cost-effective to operate. Here is where we see HPE as a key vendor today with all the right parts coming together to create a significant change in the BURA marketplace.
First, HPE is pulling together its top-notch products into a user-ready “solution” that marries both StoreOnce and Data Protector. For those who have worked with either or both of those separately in the past in conjunction with other vendors’ products, it’s no surprise that they each compete favorably one-on-one with other products in the market, but together as an integrated joint solution they beat the best competitor offerings.
But HPE hasn’t just bundled products into solutions; it is undergoing a seismic shift in culture that revitalizes its total approach to market. From product to services to support, HPE people have taken to heart a “customer first” message to provide a truly solution-focused HPE experience. One support call, one ticket, one project manager, addressing the customer’s needs regardless of what internal HPE business unit components are in the “box”. And significantly, this approach elevates HPE from just being a supplier of best-of-breed products into an enterprise-level trusted solution provider addressing business problems head-on. HPE is perhaps the only company completely able to deliver a breadth of solutions spanning IT from top to bottom out of its own internal world-class product lines.
In this report, we’ll first examine why the HPE StoreOnce and Data Protector products are truly game-changing in their own right. Then, we will look at why they get even “better together” as a complete BURA solution that can be more flexibly deployed to meet backup challenges than any other solution in the market today.
Full Database Protection Without the Full Backup Plan: Oracle’s Cloud-Scaled Zero Data Loss Recovery
Today’s tidal wave of big data isn’t just made up of loose unstructured documents – huge data growth is happening everywhere including in high-value structured datasets kept in databases like Oracle Database 12c. This data is any company’s most valuable core data that powers most key business applications – and it’s growing fast! According to Oracle, in 5 years (by 2020) most enterprises expect 50x data growth. As their scope and coverage grow, these key databases inherently become even more critical to our businesses. At the same time, the sheer number of database-driven applications and users is also multiplying – and they increasingly need to be online, globally, 24 x 7. Which all leads to the big burning question: How can we possibly protect all this critical data, data we depend on more and more even as it grows, all the time?
We just can’t keep taking more time out of the 24-hour day for longer and larger database backups. The traditional batch window backup approach is already often beyond practical limits and its problems are only getting worse with data growth – missed backup windows, increased performance degradation, unavailability, fragility, risk and cost. It’s now time for a new data protection approach that can do away with the idea of batch window backups, yet still provide immediate backup copies to recover from failures, corruption, and other disasters.
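To make the backup-window squeeze concrete, here is a rough back-of-the-envelope sketch. The starting database size, the sustained throughput, and the function name are illustrative assumptions, not figures from Oracle; only the 50x growth factor comes from the expectation cited above.

```python
def full_backup_hours(size_tb: float, throughput_tb_per_hour: float) -> float:
    """Hours needed to copy an entire dataset at a sustained throughput."""
    if throughput_tb_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return size_tb / throughput_tb_per_hour


# Hypothetical starting point: a 10 TB database backed up at 2 TB/hour.
size_today_tb = 10.0
throughput = 2.0
growth_factor = 50  # the 50x growth expectation cited in the text

hours_today = full_backup_hours(size_today_tb, throughput)
hours_later = full_backup_hours(size_today_tb * growth_factor, throughput)

print(f"Full backup today: {hours_today:.1f} h")
print(f"Full backup after {growth_factor}x growth: {hours_later:.1f} h")
print(f"Still fits in a 24-hour day: {hours_later <= 24}")
```

Under these assumed numbers, a full copy that takes a few hours today would no longer fit in a single day after the projected growth unless throughput grew just as fast, which is the practical limit the batch window approach runs into.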
Oracle has stepped up in a big way and, marshaling expertise and technologies from across its engineered systems portfolio, has developed a new Zero Data Loss Recovery Appliance. Note the very intentional name that is focused on total recoverability – the Recovery Appliance is definitely not just another backup target. This new appliance completely eliminates the pains and risks of the full database backup window approach through a highly engineered continuous data protection solution for Oracle databases. It is now possible to immediately recover any database to any point in time desired, as the Recovery Appliance provides “virtual” full backups on demand and can scale to protect thousands of databases and petabytes of capacity. In fact, it offloads backup processes from production database servers, which can increase performance in Oracle environments, typically by 25%. Adopting this new backup and recovery solution will actually give CPU cycles back to the business.
In this report, we’ll briefly review why conventional data protection approaches based on the backup window are fast becoming obsolete. Then we’ll look into how Oracle has designed the new Recovery Appliance to provide a unique approach to ensuring data protection in real time, at scale, for thousands of databases and PBs of data. We’ll see how zero data loss, incremental forever backups, continuous validation, and other innovations have completely changed the game of database data protection. For the first time there is now a real and practical way to fully protect a global corporation’s databases, both on-premises and in the cloud, even in the face of today’s tremendous big data growth.
One of the biggest storage trends we are seeing in our current research here at Taneja Group is that of storage buyers (and operators) looking for more functionality – and at the same time increased simplicity – from their storage infrastructure. For this and many other reasons, including TCO (both CAPEX and OPEX) and improved service delivery, functional “convergence” is currently a big IT theme. In storage we see IT folks wanting to eliminate excessive layers in their complex stacks of hardware and software that were historically needed to accomplish common tasks. Perhaps the biggest, most critical, and unfortunately onerous and unnecessarily complex task that enterprise storage folks have had to face is that of backup and recovery. We note that HPE, a key trusted vendor of both data protection and storage solutions, continues to invest in producing better solutions in this space.
HPE has been working diligently towards integrating data protection functionality natively within its enterprise storage solutions, starting with the highly capable tier-1 3PAR StoreServ arrays. This isn’t to say that the storage array now turns into a single autonomous unit, becoming a chokepoint or critical point of failure, but rather that it becomes capable of directly providing key data services to downstream storage clients while being directed and optimized by intelligent management (which often has a system-wide or larger perspective). This approach removes excess layers of third-party products and the inefficient indirect data flows traditionally needed to provide, assure, and then accelerate comprehensive data protection schemes. Ultimately this evolution creates a type of “software-defined data protection” in which the controlling backup and recovery software, in this case HPE’s industry-leading Data Protector, directly manages application-centric array-efficient snapshots.
In this report we examine this disruptively simple approach and how HPE extends it to the virtual environment – converging backup capabilities between Data Protector and 3PAR StoreServ to provide hardware assisted agentless backup and recovery for virtual machines. With HPE’s approach, offloading VM-centric snapshots to the array while continuing to rely on the hypervisor to coordinate the physical resources of virtual machines, virtualized organizations gain on many fronts including greater backup efficiency, reduced OPEX, greater data protection coverage, immediate and fine-grained recovery, and ultimately a more resilient enterprise. We’ll also look at why HPE is in a unique position to offer this kind of “converging” market leadership, with a complete end-to-end solution stack including innovative research and development, sales, support, and professional services.
While a few well-publicized web 2.0 companies are taking great advantage of foundational big data solutions that they have themselves created (e.g. Hadoop), most traditional enterprise IT shops are still thinking about how to practically deploy their first business-impacting big data applications – or have dived in and are now struggling mightily to effectively manage a large Hadoop cluster in the middle of their production data center. This has led to the common perception that realistic big data business value may yet be just out of reach for most organizations – especially those that need to run lean and mean on both staffing and resources.
This new big data ecosystem consists of scale-out platforms, cutting-edge open source solutions, and massive storage that is inherently difficult for traditional IT shops to optimally manage in production – especially with still-evolving ecosystem management capabilities. In addition, most organizations need to run large clusters supporting multiple users and applications to control both capital and operational costs. Yet there are no native ways to guarantee, control, or even gain visibility into workload-level performance within Hadoop. Even if most organizations did not face a real gap in high-end skills and deep expertise, there still wouldn’t be any practical way for additional experts to tweak and tune mixed Hadoop workload environments to meet production performance SLAs.
At the same time, the competitive game of mining value from big data has moved from day-long batch ELT/ETL jobs feeding downstream BI systems to more interactive user queries and “real time” business process applications. Live performance matters as much now in big data as it does in any other data center solution. Ensuring multi-tenant workload performance within Hadoop is why Pepperdata, a cluster performance optimization solution, is critical to the success of enterprise big data initiatives.
In this report we’ll look deeper into today’s Hadoop deployment challenges and learn how performance optimization capabilities are not only necessary for big data success in enterprise production environments, but can open up new opportunities to mine additional business value. We’ll look at Pepperdata’s unique performance solution that enables successful Hadoop adoption for the common enterprise. We’ll also examine how it inherently provides deep visibility and reporting into who is doing what/when for troubleshooting, chargeback and other management needs. Because Pepperdata’s function is essential and unique, not to mention its compelling net value, it should be a checklist item in any data center Hadoop implementation.
Virtualization has matured and become widely adopted in the enterprise market. HyperConverged Infrastructure (HCI), with virtualization at its core, is taking the market by storm, enabling virtualization for businesses of all sizes. The success of these technologies has been driven by an insatiable desire to make IT simpler, faster, and more efficient. IT can no longer afford the time and effort required to create custom infrastructure from best-of-breed DIY components.
With HCI, the traditional three-tier architecture has been collapsed into a single system that is purpose-built for virtualization. In these solutions, the hypervisor, compute, storage, and advanced data services are integrated into an x86 industry-standard building block. The immense success of this approach has led to increased competition in this space, and customers must now sort through the various offerings, analyzing key attributes to determine which are significant.
One of these competing vendors, Pivot3, was founded in 2002 and has been in the HCI market since 2008, well before the term HyperConverged was used. For many years, Pivot3’s vSTAC architecture has provided the most efficient scale-out Software-Defined Storage (SDS) system available on the market. This efficiency is attributed to three design innovations. The first is their extremely efficient and reliable erasure coding technology called Scalar Erasure Coding. In contrast, many leading HCI implementations use replication-based redundancy techniques, which are heavy on storage capacity utilization. Scalar Erasure Coding from Pivot3 can deliver significant capacity savings depending on the level of drive protection selected. The second innovation is Pivot3’s Global Hyperconvergence, which creates a cross-cluster virtual SAN, the HyperSAN: in case of appliance failure, a VM migrates to another node and continues operations without the need to divert compute power to copy data over to that node. The third innovation has been a reduction in the CPU overhead needed to implement the SDS features and other VM-centric management tasks. The HCI software runs on the same CPU complex as business applications; this additional usage is referred to as the HCI overhead tax. The HCI overhead tax is important because licensing costs for many applications and infrastructure software are charged on a per-CPU basis. Even with today’s ever-increasing cores per CPU, there can still be significant cost savings from keeping the HCI overhead tax low.
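The capacity argument for erasure coding over replication can be illustrated with a short sketch. This is generic textbook arithmetic, not Pivot3’s Scalar Erasure Coding: the function names, the three-copy replication factor, and the 4 data + 2 parity layout are all hypothetical values chosen only for illustration.

```python
def raw_tb_replication(usable_tb: float, copies: int) -> float:
    """Raw capacity needed when every block is stored `copies` times."""
    return usable_tb * copies


def raw_tb_erasure_coding(usable_tb: float, data_fragments: int,
                          parity_fragments: int) -> float:
    """Raw capacity needed when data is split into `data_fragments` pieces
    and protected by `parity_fragments` additional parity pieces."""
    total_fragments = data_fragments + parity_fragments
    return usable_tb * total_fragments / data_fragments


def percent_more(larger: float, smaller: float) -> float:
    """How much more `larger` is than `smaller`, as a percentage."""
    return (larger / smaller - 1.0) * 100.0


usable_tb = 100.0  # hypothetical usable capacity target

# Assumed schemes: 3-way replication versus a 4+2 erasure-coded layout.
replicated = raw_tb_replication(usable_tb, copies=3)
erasure_coded = raw_tb_erasure_coding(usable_tb, data_fragments=4,
                                      parity_fragments=2)

print(f"Replication raw capacity:    {replicated:.0f} TB")
print(f"Erasure coding raw capacity: {erasure_coded:.0f} TB")
print(f"Replication needs {percent_more(replicated, erasure_coded):.0f}% more raw capacity")
```

With these assumed parameters both schemes tolerate the loss of two copies or fragments, yet replication needs twice the raw capacity. Actual savings depend on the protection level selected, as noted above.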
The Pivot3 family of HCI products, which delivers high data efficiency with very low overhead, is an ideal solution for storage-centric business workload environments where storage costs and reliability are critical success factors. One example of this is a VDI implementation where cost per seat determines success. Other examples would be capacity-centric workloads such as big data or video surveillance that could benefit from a Pivot3 HCI approach with leading storage capacity and reliability. In this paper we compare Pivot3 with other leading HCI architectures. We used data extracted from the alternative HCI vendors’ reference architectures for VDI implementations. Using real-world examples, we demonstrate that with other solutions, users must purchase up to 136% more raw storage capacity and up to 59% more total CPU cores than are required when using equivalent Pivot3 products. These impressive results can lead to significant cost savings.