Includes iSCSI, Fibre Channel or FC, InfiniBand or IB, SMI-S, RDMA over IP, FCoE, CEE, SAS, SCSI, NPIV, SSD.
All technologies relating to storage and servers are covered in this section. Taneja Group analysts have deep technology and business backgrounds, and several have participated in the development of a number of these technologies. We take pride in explaining complex technologies in a simple enough manner for IT, the press community and the industry at large to understand.
These days the world operates in real time, all the time. Whether making airline reservations or getting the best deal from an online retailer, people expect up-to-date information at their fingertips. Businesses are expected to meet this requirement, whether they sell products or services. Having real-time, actionable information can dictate whether a business survives or dies. In-memory databases have become popular in these environments. The world's 24x7 real-time demands cannot wait for legacy ERP and CRM application rewrites. Companies such as SAP devised ways to integrate disparate databases by building a single super-fast uber-database that could operate with legacy infrastructure while simultaneously creating a new environment where real-time analytics and applications can flourish. These capabilities enable businesses to succeed in the modern age, giving forward-thinking companies a real edge in innovation.
SAP HANA is an example of an application environment that uses in-memory database technology to process massive amounts of real-time data in a short time. The in-memory computing engine allows HANA to process data stored in RAM as opposed to reading it from a disk. At the heart of SAP HANA is a database that handles both OLAP and OLTP workloads simultaneously. SAP HANA can be deployed on-premises or in the cloud. Originally, on-premises HANA was available only as a dedicated appliance. Recently SAP has expanded support to best-in-class components through its SAP Tailored Datacenter Integration (TDI) program. In this solution profile, Taneja Group examined the storage requirements for HANA TDI environments and evaluated storage alternatives including the HPE 3PAR StoreServ All Flash. We will make a strong case as to why all-flash arrays like the HPE 3PAR version are a great fit for SAP HANA solutions.
Why discuss storage for an in-memory database? The reason is simple: RAM loses its mind when the power goes off. This means that persistent shared storage is at the heart of the HANA architecture for scalability, disaster tolerance, and data protection. The performance attributes of your shared storage dictate how many nodes you can cluster into a SAP HANA environment, which in turn affects your business outcomes. Greater scalability means more real-time information can be processed. SAP HANA shared storage requirements are unusual in that the workload is write intensive, demanding low latency for small files and high sequential throughput for large files. However, the overall storage capacity is not extreme, which makes this workload an ideal fit for all-flash arrays that can meet performance requirements with the smallest quantity of SSDs. Typically you would need 10X as many spinning media drives just to meet the performance requirements, which then leaves you with a massive amount of capacity that cannot be used for other purposes.
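The sizing argument above can be illustrated with a minimal sketch. It assumes a simple model in which the drive count is set by whichever is larger, the number of drives needed for performance or the number needed for capacity. The function name and every workload and drive figure below are hypothetical, chosen only to show the effect; none of them come from the study.

```python
import math


def drives_needed(required_iops, required_tb, iops_per_drive, tb_per_drive):
    """Return (drive_count, stranded_tb) under a simple sizing model.

    The drive count is the larger of the count needed to meet the
    performance target and the count needed to meet the capacity target.
    Stranded capacity is whatever raw capacity exceeds the requirement.
    """
    for_performance = math.ceil(required_iops / iops_per_drive)
    for_capacity = math.ceil(required_tb / tb_per_drive)
    count = max(for_performance, for_capacity)
    stranded_tb = count * tb_per_drive - required_tb
    return count, stranded_tb


# Hypothetical workload: modest capacity, high IOPS (illustrative only).
WORKLOAD_IOPS = 20_000
WORKLOAD_TB = 20

# Hypothetical per-drive figures (illustrative only, not vendor data).
ssd_count, ssd_stranded = drives_needed(
    WORKLOAD_IOPS, WORKLOAD_TB, iops_per_drive=20_000, tb_per_drive=2
)
hdd_count, hdd_stranded = drives_needed(
    WORKLOAD_IOPS, WORKLOAD_TB, iops_per_drive=200, tb_per_drive=4
)

print(f"SSD: {ssd_count} drives, {ssd_stranded} TB stranded")
print(f"HDD: {hdd_count} drives, {hdd_stranded} TB stranded")
```

With these placeholder inputs the flash configuration is sized by capacity (10 drives, nothing stranded), while the spinning-media configuration is sized by performance (100 drives) and leaves 380 TB of capacity that the workload does not need.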
In this study, we examined five leading all-flash arrays including the HPE 3PAR StoreServ 8450 All Flash. We found that the unique architecture of the 3PAR array could meet HANA workload requirements with up to 73% fewer SSDs, 76% less power, and 60% less rack space.
The storage market is truly changing for the better with new storage architectures finally breaking the rusty chains long imposed on IT by traditional monolithic arrays. Vast increases in CPU power found in newer generations of servers (and supported by ever faster networks) have now freed key storage functionality to run wherever it can best serve applications. This freedom has led to the rise of all software-defined storage (SDS) solutions that power modular HyperConverged infrastructure (HCI). At the same time, increasingly affordable flash resources have enabled all-flash array options that promise both OPEX simplification and inherent performance gains. Now, we see a further evolution of storage that intelligently converges performance-oriented storage functions on each server while avoiding major problems with HyperConverged “single appliance” adoption.
Given the market demand for better, more efficient storage solutions, especially those capable of large scale, low latency and mixed use, we are seeing a new generation of vendors like Datrium emerge. Datrium studied the key benefits that hyperconvergence previously brought to market including the leverage of server-side flash for cost-effective IO performance, but wanted to avoid the all-in transition and the risky “monoculture” that can result from vendor-specific HCI. Their resulting design runs compute-intensive IO tasks scaled-out on each local application server (similar to parts of SDS), but persists and fully protects data on cost-efficient, persistent shared storage capacity. We have come to refer to this optimizing tiered design approach as “Server Powered Storage” (SPS), indicating that it can take advantage of the best of both shared and server-side resources.
Ultimately this results in an “Open Convergence” approach that helps virtualized IT environments transition off of aging storage arrays along an easier, more flexible and more natural adoption path than a fork-lift HyperConvergence migration. In this report we will briefly review the challenges and benefits of traditional convergence with SANs, the rise of SDS and HCI appliances, and now this newer “open convergence” SPS approach as pioneered by Datrium DVX. In particular, we’ll review how Datrium offers benefits including elastic performance, greater efficiency (with independent scaling of performance vs. capacity), VM-centric management, enterprise scalability and mixed workload support while still delivering on enterprise requirements for data resiliency and availability.
Apache Spark has quickly grown into one of the major big data ecosystem projects and shows no signs of slowing down. In fact, even though Spark is well connected within the broader Hadoop ecosystem, Spark adoption by itself has enough energy and momentum that it may very well become the center of its own emerging market category. In order to better understand Spark’s growing role in big data, Taneja Group conducted a major Spark market research project. We surveyed nearly seven thousand (6900+) qualified technical and managerial people working with big data from around the world to explore their experiences with and intentions for Spark adoption and deployment, their current perceptions of the Spark marketplace and of the future of Spark itself.
We found that across the broad range of industries, company sizes, and big data maturities represented in the survey, over one-half (54%) of respondents are already actively using Spark. Spark is proving invaluable as 64% of those currently using Spark plan to notably increase their usage within the next 12 months. And new Spark user adoption is clearly growing – 4 out of 10 of those who are already familiar with Spark but not yet using it plan to deploy Spark soon.
The top reported use cases globally for Spark include the expected Data Processing/Engineering/ETL (55%), followed by forward-looking data science applications like Real-Time Stream Processing (44%), Exploratory Data Science (33%), and Machine Learning (33%). The more traditional analytics applications like Customer Intelligence (31%) and BI/DW (29%) were close behind, and illustrate that Spark is capable of supporting many different kinds of organizational big data needs. The main reasons and drivers reported for adopting Spark over other solutions start with Performance (mentioned by 74%), followed by capabilities for Advanced Analytics (49%), Stream Processing (42%) and Ease of Programming (37%).
When it comes to choosing a source for Spark, more than 6 out of 10 Spark users in the survey have considered or evaluated Cloudera, nearly double the 35% that may have looked at the Apache Download or the 33% that considered Hortonworks. Interestingly, almost all (90+%) of those looking at Cloudera Spark adopted it for their most important use case, equating to 57% of Spark users overall. Organizations cited quality of support (46%) as their most important selection factor, followed by demonstrated commitment to open source (29%), enterprise licensing costs (27%) and the availability of cloud support (also 27%).
Interestingly, while on-premises Spark deployments dominate today (more than 50%), there is strong interest in transitioning many of those to cloud deployments going forward. Overall Spark deployment in public/private cloud (IaaS or PaaS) is projected to increase significantly from 23% today to 36%, along with a corresponding increase in using Spark SaaS, from 3% to 9%.
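To put those projected shifts in perspective, the short sketch below works out both the percentage-point change and the relative growth for each deployment model. The percentages are the ones quoted above; the helper name `growth` is ours and purely illustrative.

```python
def growth(today_pct, projected_pct):
    """Return (percentage-point change, relative growth in percent)."""
    points = projected_pct - today_pct
    relative = points / today_pct * 100
    return points, relative


# Survey figures quoted in the text: share of deployments today vs. projected.
projections = {
    "Cloud (IaaS or PaaS)": (23, 36),
    "Spark SaaS": (3, 9),
}

for model, (today, projected) in projections.items():
    points, relative = growth(today, projected)
    print(f"{model}: +{points} points, {relative:.1f}% relative growth")
```

Read this way, cloud deployment grows by 13 points (roughly 57% relative growth), while Spark SaaS grows by only 6 points but triples from its small base.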
The biggest challenge with Spark, similar to what has been previously noted across the broader big data solutions space, is still reported by 6 out of 10 active users to be the big data skills/training gap within their organizations. Similarly, more than one-third mention complexity in learning/integrating Spark as a barrier to adoption. Despite these reservations, we note that compared to many previous big data analytics platforms, Spark today offers a higher—and often already familiar—level of interaction to users through its support of Python, R, SQL, notebooks, and seamless desktop-to-cluster operations, all of which no doubt contribute to its greatly increasing popularity and widespread adoption.
Overall, it’s clear that Spark has gained broad familiarity within the big data world and built significant momentum around adoption and deployment. The data highlights widespread current user success with Spark, validation of its reliability and usefulness to those who are considering adoption, and a growing set of use cases to which Spark can be successfully applied. Other big data solutions can offer some similar and overlapping capabilities (there is always something new just around the corner), but we believe that Spark, having already captured significant mindshare and proven real-world value, will continue to successfully expand on its own vortex of focus and energy for at least the next few years.
Today we are seeing big impacts on storage from the huge increase in the scale of an organization’s important data (e.g. Big Data, Internet of Things) and the growing size of virtualization clusters (e.g. never-ending VMs, VDI, cloud-building). In addition, virtualization adoption tends to increase the generalization of IT admins. In particular, IT groups are focusing more on servicing users and applications and no longer want to be just managing infrastructure for infrastructure’s sake. Everything that IT does, including storage, is increasingly interpreted, analyzed, and managed in application/business terms to optimize the return on the total IT investment. To move forward, an organization’s storage infrastructure not only needs to grow internally smarter, it also needs to become both VM and application aware.
While server virtualization made a lot of things better for the over-taxed IT shop, delivering quality storage services in hypervisor infrastructures with traditional storage created difficult challenges. In response Tintri pioneered per-VM storage infrastructure. The Tintri VMstore has eliminated multiple points of storage friction and pain. In fact, it’s now becoming a mandatory checkbox across the storage market for all arrays to claim some kind of VM-centricity. Unfortunately, traditional arrays are mainly focused on checking off rudimentary support for external hypervisor APIs that only serve to re-package the same old storage. The best fit to today’s (and tomorrow’s) virtual storage requirements will only come from fully engineered VM-centric storage and application-aware approaches as Tintri has done.
However, it’s not enough to simply drop in storage that automatically drives best practice policies and handles today’s needs. We all know change is constant, and key to preparing for both growth and change is having a detailed, properly focused view of today’s large scale environments, along with smart planning tools that help IT both optimize current resources and make the best IT investment decisions going forward. To meet those larger needs, Tintri has rolled out a SaaS-based Tintri Analytics offering that applies big data analytical power to the large volume of VM-aware metrics collected from its customers’ VMstore systems.
In this report we will look briefly at Tintri’s overall “per-VM” storage approach and then take a deeper look at their new Tintri Analytics offering. The new Tintri Analytics management service further optimizes their app-aware VM storage with advanced VM-centric performance and capacity management. With this new service, Tintri is helping their customers receive greater visibility, insight and analysis over large, cloud-scale virtual operations. We’ll see how “big data” enhanced intelligence provides significant value and differentiation, and get a glimpse of the payback that a predictive approach provides both the virtual admin and application owners.
We are moving into a new era of data storage. The traditional storage infrastructure that we know (and do not necessarily love) was designed to process and store input from human beings. People input emails, word processing documents and spreadsheets. They created databases and recorded business transactions. Data was stored on tape, workstation hard drives, and over the LAN.
In the second stage of data storage development, humans still produced most content but there was more and more of it, and file sizes got larger and larger. Video and audio, digital imaging, and websites streaming entertainment content to millions of users all contributed, with no end to data growth in sight. Storage capacity grew to encompass large data volumes and flash became more common in hybrid and all-flash storage systems.
Today, the storage environment has undergone another major change. The major content producers are no longer people, but machines. Storing and processing machine data offers tremendous opportunities: Seismic and weather sensors that may lead to meaningful disaster warnings. Social network diagnostics that display hard evidence of terrorist activity. Connected cars that could slash automotive fatalities. Research breakthroughs around the human brain thanks to advances in microscopy.
However, building storage systems that can store raw machine data and process it is not for the faint of heart. The best solution today is massively scale-out, general purpose NAS. This type of storage system has a single namespace capable of storing billions of differently sized files, linearly scales performance and capacity, and offers data-awareness and real-time analytics using extended metadata.
There are very few vendors in the world today who offer this solution. One of them is Qumulo. Qumulo’s mission is to provide high volume storage to business and scientific environments that produce massive volumes of machine data.
To gauge how well Qumulo works in the real world of big data, we spoke with six customers from the life sciences, media and entertainment, telco/cable/satellite, higher education and automotive industries. Each customer deals with massive machine-generated data and uses Qumulo to store, manage, and curate mission-critical data volumes 24x7. Customers cited five major benefits of Qumulo: massive scalability, high performance, data-awareness and analytics, extreme reliability, and top-flight customer support.
Read on to see how Qumulo supports large-scale data storage and processing in these mission-critical, intensive machine data environments.
The race is on at full speed. What race? The race to bring public cloud agility and economics to a data center near you. Ever since the first integrated systems came onto the scene in 2010, vendors have been furiously engineering solutions to make on-premises infrastructure as cost effective and as easy to use as the public cloud, while also providing the security, availability, and control that enterprises demand. Fundamentally, two main architectures have evolved within the race to modernize data centers that will create a foundation enabling fully private and hybrid clouds. The first approach uses traditional compute, storage, and networking infrastructure components (traditional 3-tier) overlaid with varying degrees of virtualization and management software. The second more recent approach is to build a fully virtualized data center using industry standard servers and networking and then layer on top of that a full suite of software-based compute, network, and storage virtualization with management software. This approach is often termed a Software-Defined Data Center (SDDC).
The goal of an SDDC is to extend virtualization techniques across the entire data center to enable the abstraction, pooling, and automation of all data center resources. This would allow a business to dynamically reallocate any part of the infrastructure for various workload requirements without forklifting hardware or rewiring. VMware has taken SDDC to a new level with VMware Cloud Foundation. VMware Cloud Foundation is the only unified SDDC platform for the hybrid cloud, which brings together VMware’s compute, storage, and network virtualization into a natively integrated stack that can be deployed on-premises or run as a service from the public cloud. It establishes a common cloud infrastructure foundation that gives customers a unified and consistent operational model across the private and public cloud.
VMware Cloud Foundation delivers an industry-leading SDDC cloud infrastructure by combining VMware’s highly scalable hyper-converged software (vSphere and VSAN) with the industry-leading network virtualization platform, NSX. VMware Cloud Foundation comes with unique lifecycle management capabilities (SDDC Manager) that eliminate the overhead of system operations of the cloud infrastructure stack by automating day 0 to day 2 processes such as bring-up, configuration, workload provisioning, and patching/upgrades. As a result, customers can significantly shorten application time to market, boost cloud admin productivity, reduce risk, and lower TCO. Customers consume VMware Cloud Foundation software in three ways: factory pre-loaded on integrated systems (VxRack 1000 SDDC); deployed on top of qualified Ready Nodes from HPE, QCT, Fujitsu, and others in the future, with qualified networking; and run as a service from the public cloud through IBM, vCAN partners, vCloud Air, and more to come.
In this comparative study, Taneja Group performed an in-depth analysis of VMware Cloud Foundation deployed on qualified Ready Nodes and qualified networking versus several traditional 3-tier converged infrastructure (CI) integrated systems and traditional 3-tier do-it-yourself (DIY) systems. We analyzed the capabilities and contrasted key functional differences driven by the various architectural approaches. In addition, we evaluated the key CapEx and OpEx TCO cost components. Taneja Group configured each traditional 3-tier system's hardware capacity to be as close as possible to the VMware Cloud Foundation qualified hardware capacity. Further, since none of the 3-tier systems had a fully integrated SDDC software stack, Taneja Group added the missing SDDC software, making it as close as possible to the VMware Cloud Foundation software stack. The quantitative comparative results from the traditional 3-tier DIY and CI systems were averaged together into one scenario because the hardware and software components are very similar.
Our analysis concluded that both types of solutions are more than capable of handling a variety of virtualized workload requirements. However, VMware Cloud Foundation has demonstrated a new level of ease-of-use due to its modular scale-out architecture, native integration, and automatic lifecycle management, giving it a strong value proposition when building out modern next generation data centers. The following are the five key attributes that stood out during the analysis:
- Native Integration of the SDDC: VMware Cloud Foundation natively integrates vSphere, Virtual SAN (VSAN), and NSX network virtualization.
- Simplest operational experience: VMware SDDC Manager automates the lifecycle of the SDDC stack including bring-up, configuration, workload provisioning, and patching/upgrades.
- Isolated workload domains: VMware Cloud Foundation provides unique administrator tools to flexibly provision subsets of the infrastructure for multi-tenant isolation and security.
- Modular linear scalability: VMware Cloud Foundation employs an architecture in which capacity can be scaled by the HCI node, by the rack, or by multiple racks.
- Seamless Hybrid Cloud: Deploy VMware Cloud Foundation for private cloud and consume on public clouds to create a seamless hybrid cloud with a consistent operational experience.
Taneja Group’s in-depth analysis indicates that VMware Cloud Foundation will enable enterprises to achieve significant cost savings. Hyper-converged infrastructure, used by many web-scale service providers, with natively integrated SDDC software significantly reduced server, storage, and networking costs. This hardware cost saving more than offset the incremental SDDC software costs needed to deliver the storage and networking capability that typically is provided in hardware from best of breed traditional 3-tier components. In this study, we measured the upfront CapEx and 3 years of support costs for the hardware and software components needed to build out a VMware Cloud Foundation private cloud on qualified Ready Nodes. In addition, Taneja Group validated a model that demonstrates the labor and time OpEx savings that can be achieved through the use of integrated end-to-end automatic lifecycle management in the VMware SDDC Manager software.
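The cost components described above (upfront CapEx, three years of support, and labor OpEx) can be combined into a simple comparison. The sketch below is a minimal illustration of that arithmetic, not the model used in the study. The class, field names, and all dollar figures are hypothetical placeholders and are not results from the analysis.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    """Hypothetical cost breakdown for one infrastructure option."""
    hardware_capex: int
    software_capex: int
    annual_support: int
    annual_labor: int

    def tco(self, years: int = 3) -> int:
        """Upfront CapEx plus recurring support and labor over `years`."""
        recurring = self.annual_support + self.annual_labor
        return self.hardware_capex + self.software_capex + years * recurring


def savings_pct(baseline: Costs, alternative: Costs, years: int = 3) -> float:
    """Percent by which the alternative's TCO is lower than the baseline's."""
    base = baseline.tco(years)
    return (base - alternative.tco(years)) / base * 100


# Placeholder figures for illustration only (not data from the study).
three_tier = Costs(hardware_capex=1_000_000, software_capex=200_000,
                   annual_support=100_000, annual_labor=100_000)
sddc = Costs(hardware_capex=500_000, software_capex=400_000,
             annual_support=80_000, annual_labor=40_000)

print(f"3-tier 3-year TCO: {three_tier.tco():,}")
print(f"SDDC 3-year TCO:   {sddc.tco():,}")
print(f"Savings: {savings_pct(three_tier, sddc):.0f}%")
```

The placeholder inputs show the pattern the paragraph describes: the software line item is higher for the SDDC option, but lower hardware and labor costs more than offset it over the three-year window.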
By investing in VMware Cloud Foundation, businesses can be assured that their data center infrastructure can be easily consumed, scaled, managed, upgraded and enhanced to provide the best private cloud at the lowest cost. Using a pre-engineered modular, scale-out approach to building at web-scale means infrastructure is added in hours, not days, and businesses can be assured that adding infrastructure scales linearly without complexity. VMware Cloud Foundation is the only platform that provides a natively integrated unified SDDC platform for the hybrid cloud with end-to-end management and with the flexibility to provision a wide variety of workloads at the push of a button.
In summary, VMware Cloud Foundation enables at least five unparalleled capabilities, generates a 45% lower 3-year TCO than the alternative traditional 3-tier approaches, and delivers a tremendous value proposition when building out a modern hybrid SDDC platform. Before defaulting to the traditional infrastructure approach, companies should take a close look at VMware Cloud Foundation, a unified SDDC platform for the hybrid cloud.