Items Tagged: Data Management
Actifio PAS 5.0: Shrinking Backup Windows & Unattainable SLAs? How About Instant Restores with Guaranteed SLAs
If you have ever worried about shrinking backup windows or missed SLAs, you should take a serious look at Actifio PAS 5.0, which provides instantaneous restores with guaranteed data recoverability and guaranteed SLAs.
- Premiered: 06/19/12
- Author: Taneja Group
The Power of Distributed Object Storage
Several years ago, Taneja Group predicted the inevitable emergence of what we called "cloud-based storage." Today, this technology is behind most cloud offerings for unstructured data, whether public or private. We defined cloud-based storage as the highly scalable, RESTful API accessible, object-based technology that is no longer just an Amazon S3 offering, but is served up by all manner of product vendors and providers.
- Premiered: 01/09/13
- Author: Taneja Group
- Published: InfoStor.com
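To make "RESTful API accessible, object-based" concrete, here is a minimal sketch of writing and reading an object against a generic S3-compatible service using Python's boto3 client. The endpoint URL, credentials, bucket, and key are hypothetical placeholders, not any particular vendor's service.
```python
# Minimal sketch: store and retrieve an object over an S3-compatible REST API.
# Endpoint, credentials, bucket, and key are hypothetical placeholders.
import boto3

s3 = boto3.client(
    "s3",
    endpoint_url="https://objects.example.com",  # any S3-compatible object store
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
)

# PUT: write an object (under the hood this is an HTTP PUT to /bucket/key).
s3.put_object(Bucket="archive", Key="reports/2013/object-storage.txt",
              Body=b"unstructured data payload")

# GET: read it back (an HTTP GET to the same URL).
obj = s3.get_object(Bucket="archive", Key="reports/2013/object-storage.txt")
print(obj["Body"].read())
```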
Data Defined Storage: Building on the Benefits of Software Defined Storage
At its core, Software Defined Storage decouples storage management from the physical storage system. In practice, Software Defined Storage vendors implement the solution using a variety of technologies: orchestration layers, virtual appliances, and server-side products are all in the market now. These solutions are valuable for storage administrators who struggle to manage multiple storage systems in the data center as well as remote data repositories.
What Software Defined Storage does not do is yield more value for the data under its control, or address global information governance requirements. To that end, Data Defined Storage yields the benefits of Software Defined Storage while also reducing data risk and increasing data value throughout the distributed data infrastructure. In this report we will explore how Tarmin’s GridBank Data Management Platform provides Software Defined Storage benefits and also drives reduced risk and added business value for distributed unstructured data with Data Defined Storage.
Converging Branch IT Infrastructure the Right Way: Riverbed SteelFusion
Companies with significant non-data center and often widely distributed IT infrastructure requirements face many challenges. It can be difficult enough to manage tens, hundreds, or even thousands of remote or branch office locations, but many of these are also located in dirty or dangerous environments that are simply not suited to standard data center infrastructure. It is also hard, if not impossible, to forward-deploy the IT expertise needed to manage any locally placed resources. The key challenge, then, and one that can be competitively differentiating on cost alone, is to simplify branch IT as much as possible while still supporting the branch business.
Converged solutions have become widely popular in the data center, particularly in virtualized environments. Because multiple functions are tightly integrated into one package, there are fewer separate moving parts for IT to manage, while capabilities are optimized through the intimate integration of those components. IT becomes more efficient and in many ways gains more control over the whole environment. Beyond the obvious increase in IT simplicity there are many other cascading benefits: the converged infrastructure can perform better, is more resilient and available, and offers better security than separately assembled silos of components. And a big benefit is a drastically lower TCO.
Yet for a number of reasons, data center convergence approaches have not translated into equally beneficial convergence in the branch. No matter how tightly integrated a "branch in a box" is, if it is just an assemblage of the usual storage, server, and networking silo components it will still suffer from traditional branch infrastructure challenges: second-class performance, low reliability, high OPEX, and difficulty with protection and recovery. Branches have unique needs, and data center infrastructure, converged or otherwise, is not designed to meet them. This is where Riverbed has pioneered a truly innovative converged infrastructure designed explicitly for the branch, one that provides simplified deployment and provisioning, resiliency in the face of network issues, improved protection and recovery from the central data center, optimization and acceleration for remote performance, and greatly lowered OPEX.
In this paper we review Riverbed's SteelFusion (formerly known as Granite) branch converged infrastructure solution, and see how it marries multiple technical advances, including WAN optimization, stateless compute, and "projected" data center storage, to solve those branch challenges and bring the benefits of convergence out to branch IT. We'll see how SteelFusion not only fulfills the promise of a converged "branch" infrastructure that supports distributed IT, but also accelerates the business that runs on it.
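To make the general idea of "projecting" data center storage out to a branch more concrete, the sketch below models a branch-side block cache with a local write log that asynchronously drains back to the central store. This is a simplified, hypothetical illustration of the pattern, assuming a dict-like central store; it is not Riverbed's implementation, and all names are invented.
```python
# Hypothetical illustration of "projected" storage: a branch-side cache over a
# central (data center) block store, with local reads/writes and asynchronous
# write-back across the WAN. Not Riverbed's implementation; names are invented.
from collections import OrderedDict, deque

class BranchProjectedVolume:
    def __init__(self, datacenter_store, cache_blocks=1024):
        self.dc = datacenter_store          # dict-like: block_id -> bytes
        self.cache = OrderedDict()          # local working set (LRU)
        self.cache_blocks = cache_blocks
        self.write_log = deque()            # pending writes to replay centrally

    def read(self, block_id):
        if block_id in self.cache:          # branch-local hit: no WAN round trip
            self.cache.move_to_end(block_id)
            return self.cache[block_id]
        data = self.dc.get(block_id, b"\0" * 4096)   # WAN fetch only on a miss
        self._insert(block_id, data)
        return data

    def write(self, block_id, data):
        self._insert(block_id, data)        # acknowledged locally at the branch
        self.write_log.append((block_id, data))

    def drain(self):
        while self.write_log:               # asynchronous write-back keeps the
            block_id, data = self.write_log.popleft()  # authoritative copy central
            self.dc[block_id] = data

    def _insert(self, block_id, data):
        self.cache[block_id] = data
        self.cache.move_to_end(block_id)
        if len(self.cache) > self.cache_blocks:
            self.cache.popitem(last=False)  # evict least recently used block
```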
Why Facebook and the NSA love graph databases
Is there a benefit to understanding how your users, suppliers or employees relate to and influence one another? It's hard to imagine that there is a business that couldn't benefit from more detailed insight and analysis, let alone prediction, of its significant relationships.
- Premiered: 06/17/14
- Author: Mike Matchett
- Published: Tech Target: Search Data Center
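For readers new to the idea, a graph database stores entities as nodes and relationships as first-class edges, which makes relationship questions ("who is connected to whom, two hops out?") cheap to express. The toy sketch below illustrates that concept with a plain in-memory adjacency list and invented data; it does not represent any particular graph database product.
```python
# Toy illustration of why graphs fit relationship queries: an adjacency list
# and a two-hop "influence" traversal. All data here is invented.
follows = {
    "alice": {"bob", "carol"},
    "bob": {"dave"},
    "carol": {"dave", "erin"},
    "dave": set(),
    "erin": {"alice"},
}

def within_two_hops(person):
    """People reachable from `person` in one or two relationship hops."""
    first_hop = follows.get(person, set())
    second_hop = set().union(*[follows.get(p, set()) for p in first_hop])
    return (first_hop | second_hop) - {person}

print(sorted(within_two_hops("alice")))   # ['bob', 'carol', 'dave', 'erin']
```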
New choices bring enterprise big data home
Enterprises recognize the tantalizing value of big data analytics, but traditional concerns about data management and security have held back deployments -- until now.
- Premiered: 12/02/14
- Author: Mike Matchett
- Published: Tech Target: Search Data Center
Manage Unstructured Data: File Analysis at Scale
Active data is relatively visible and manageable on production systems, but the longer data ages, the less visible it becomes to users and processes.
- Premiered: 04/22/15
- Author: Christine Taylor
- Published: Infostor
Expert Video: Copy data management methods for storage
On average, how many copies of data get made between production and test/dev? Probably more than you think, according to Mike Matchett.
- Premiered: 06/29/15
- Author: Taneja Group
- Published: TechTarget: Search Storage
Hyper-converged vendors focus on data protection
Hyper-convergence has impacted primary storage, but Arun Taneja says hyper-converged vendors are bringing the concept to data protection.
- Premiered: 09/03/15
- Author: Arun Taneja
- Published: TechTarget: Search Data Backup
Cohesity Launches the Industry's First Secondary Storage Solution to Unify Fragmented Data Landscape
Cohesity, the pioneer of converged secondary storage, today announced the public launch of the Cohesity Data Platform, the first product designed to consolidate all secondary storage use cases on a unified environment that helps organizations control growing data demands.
- Premiered: 10/14/15
- Author: Taneja Group
- Published: Virtual-Strategy Magazine
Hyperscale Cloud Vendors vs. Cloud-Enabled Data Protection Vendors
Cost alone should not be the deciding factor when deciding between a hyperscale cloud vendor or a cloud-enabled data protection provider.
- Premiered: 11/06/15
- Author: Jim Whalen
- Published: Datamation
Array Efficient, VM-Centric Data Protection: HPE Data Protector and 3PAR StoreServ
One of the biggest storage trends we are seeing in our current research here at Taneja Group is that of storage buyers (and operators) looking for more functionality – and at the same time increased simplicity – from their storage infrastructure. For this and many other reasons, including TCO (both CAPEX and OPEX) and improved service delivery, functional “convergence” is currently a big IT theme. In storage we see IT folks wanting to eliminate excessive layers in their complex stacks of hardware and software that were historically needed to accomplish common tasks. Perhaps the biggest, most critical, and unfortunately onerous and unnecessarily complex task that enterprise storage folks have had to face is that of backup and recovery. As a key trusted vendor of both data protection and storage solutions, we note that HPE continues to invest in producing better solutions in this space.
HPE has diligently been working towards integrating data protection functionality natively within their enterprise storage solutions starting with the highly capable tier-1 3PAR StoreServ arrays. This isn’t to say that the storage array now turns into a single autonomous unit, becoming a chokepoint or critical point of failure, but rather that it becomes capable of directly providing key data services to downstream storage clients while being directed and optimized by intelligent management (which often has a system-wide or larger perspective). This approach removes excess layers of 3rd party products and the inefficient indirect data flows traditionally needed to provide, assure, and then accelerate comprehensive data protection schemes. Ultimately this evolution creates a type of “software-defined data protection” in which the controlling backup and recovery software, in this case HPE’s industry-leading Data Protector, directly manages application-centric array-efficient snapshots.
In this report we examine this disruptively simple approach and how HPE extends it to the virtual environment, converging backup capabilities between Data Protector and 3PAR StoreServ to provide hardware-assisted, agentless backup and recovery for virtual machines. With HPE's approach, which offloads VM-centric snapshots to the array while continuing to rely on the hypervisor to coordinate the physical resources of virtual machines, virtualized organizations gain on many fronts: greater backup efficiency, reduced OPEX, greater data protection coverage, immediate and fine-grained recovery, and ultimately a more resilient enterprise. We'll also look at why HPE is in a unique position to offer this kind of "converging" market leadership, with a complete end-to-end solution stack that includes innovative research and development, sales, support, and professional services.
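The control flow described above, in which the backup application coordinates an application-consistent, array-side snapshot rather than streaming data through host agents, can be sketched at a very high level as follows. This is purely an illustrative, hypothetical sequence; the function names are invented and are not HPE Data Protector or 3PAR StoreServ APIs.
```python
# Hypothetical, high-level sketch of array-offloaded, VM-centric protection:
# quiesce the VM via the hypervisor, let the array take the snapshot, release
# the VM, and catalog the recovery point. Names are invented for illustration
# and do not correspond to HPE Data Protector or 3PAR StoreServ APIs.
import datetime

def protect_vm(hypervisor, array, catalog, vm_name):
    # 1. Ask the hypervisor to quiesce the VM for application consistency.
    vm = hypervisor.find_vm(vm_name)
    hypervisor.quiesce(vm)
    try:
        # 2. Offload the copy: the array creates space-efficient snapshots of
        #    the volumes backing the VM, so no bulk data crosses the LAN.
        volumes = hypervisor.backing_volumes(vm)
        snapshot_ids = [array.create_snapshot(v) for v in volumes]
    finally:
        # 3. Release the VM immediately; the snapshots are already consistent.
        hypervisor.unquiesce(vm)
    # 4. Record the recovery point so later restores (full VM or fine-grained)
    #    can be driven directly from the array snapshots.
    catalog.record(vm_name, snapshot_ids, datetime.datetime.utcnow())
    return snapshot_ids
```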
Quantum bringing public cloud into virtual storage fold
Data management specialist Quantum wants to make it easier for enterprises to combine two of the major trends in storage: virtualization and public clouds.
- Premiered: 01/26/16
- Author: Taneja Group
- Published: PC World
Veritas Embarks on Journey to Transform Cloud Data Management
If you have lots of data (and what enterprise today doesn't have a massive amount of data?), you need to store it, protect it, and enable access to it; more and more, you are probably also moving at least some of it to the cloud.
- Premiered: 09/16/16
- Author: Steve Ricketts
Kinetica Unveils GPU-accelerated Database for Analyzing Streaming Data with Enhanced Performance
Kinetica today announced the newest release of its distributed, in-memory database accelerated by GPUs that simultaneously ingests, explores, and visualizes streaming data.
- Premiered: 09/21/16
- Author: Taneja Group
- Published: Business Wire
Startup Datera stretches Elastic Data Fabric with all-flash nodes
Datera Elastic Data Fabric qualifies Dell PowerEdge servers, adds all-flash nodes, and increases scaling to 50 nodes and 5 PB raw capacity per cluster.
- Premiered: 09/29/16
- Author: Taneja Group
- Published: TechTarget: Search Cloud Storage
Apache Spark Survey Reveals Increased Growth in Users
In order to better understand Apache Spark’s growing role in big data, Taneja Group conducted a major market research project, surveying approximately 7,000 people.
- Premiered: 11/08/16
- Author: Taneja Group
- Published: Satellite Press Releases
Datrium's Optimized Platform for Virtualized IT: "Open Convergence" Challenges HyperConvergence
The storage market is truly changing for the better with new storage architectures finally breaking the rusty chains long imposed on IT by traditional monolithic arrays. Vast increases in CPU power found in newer generations of servers (and supported by ever faster networks) have now freed key storage functionality to run wherever it can best serve applications. This freedom has led to the rise of software-defined storage (SDS) solutions that power modular HyperConverged infrastructure (HCI). At the same time, increasingly affordable flash resources have enabled all-flash array options that promise both OPEX simplification and inherent performance gains. Now, we see a further evolution of storage that intelligently converges performance-oriented storage functions on each server while avoiding major problems with HyperConverged "single appliance" adoption.
Given the market demand for better, more efficient storage solutions, especially those capable of large scale, low latency and mixed use, we are seeing a new generation of vendors like Datrium emerge. Datrium studied the key benefits that hyperconvergence previously brought to market including the leverage of server-side flash for cost-effective IO performance, but wanted to avoid the all-in transition and the risky “monoculture” that can result from vendor-specific HCI. Their resulting design runs compute-intensive IO tasks scaled-out on each local application server (similar to parts of SDS), but persists and fully protects data on cost-efficient, persistent shared storage capacity. We have come to refer to this optimizing tiered design approach as “Server Powered Storage” (SPS), indicating that it can take advantage of the best of both shared and server-side resources.
Ultimately this results in an "Open Convergence" approach that helps virtualized IT environments transition off aging storage arrays via an easier, more flexible, and more natural adoption path than a fork-lift HyperConvergence migration. In this report we will briefly review the challenges and benefits of traditional convergence with SANs, the rise of SDS and HCI appliances, and now this newer "open convergence" SPS approach as pioneered by Datrium DVX. In particular, we'll review how Datrium offers benefits ranging from elastic performance, greater efficiency (with independent scaling of performance vs. capacity), VM-centric management, enterprise scalability, and mixed workload support, while still delivering on enterprise requirements for data resiliency and availability.
DATA Challenges in Virtualized Environments
Virtualized environments present a number of unique challenges for user data. In physical server environments, islands of storage were mapped uniquely to server hosts. While at scale that becomes expensive, isolating resources and requiring a lot of configuration management (all reasons to virtualize servers), this at least provided directly mapped relationships to follow when troubleshooting, scaling capacity, handling IO growth or addressing performance.
However, in the virtual server environment, the layers of virtual abstraction that help pool and share real resources also obfuscate and "mix up" where IO actually originates or flows, making it difficult to understand who is doing what. Worse, the hypervisor platform aggregates IO from different workloads, hindering optimization and preventing prioritization. Hypervisors also tend to dynamically move virtual machines around a cluster to load balance servers. Fundamentally, server virtualization makes it hard to meet application storage requirements with traditional storage approaches.
Current Virtualization Data Management Landscape
Let’s briefly review the three current trends in virtualization infrastructure used to ramp up data services to serve demanding and increasingly larger scale clusters:
- Converged Infrastructure - with hybrid/All-Flash Arrays (AFA)
- HyperConverged Infrastructure - with Software Defined Storage (SDS)
- Open Converged Infrastructure - with Server Powered Storage (SPS)
Converged Infrastructure - Hybrid and All-Flash Storage Arrays (AFA)
We first note that converged infrastructure solutions simply pre-package and rack traditional arrays with traditional virtualization cluster hosts. The traditional SAN provides well-proven and trusted enterprise storage. The primary added value of converged solutions is in a faster time-to-deploy for a new cluster or application. However, ongoing storage challenges and pain points remain the same as in un-converged clusters (despite claims of converged management as these tend to just aggregate dashboards into a single view).
The traditional array provides shared storage from which virtual machines draw both images and data, either across Fibre Channel or an IP network (NAS or iSCSI). While many SANs in the hands of an experienced storage admin can be highly configurable, they do require specific expertise to administer. Almost every traditional array has by now become effectively hybrid, capable of hosting various amounts of flash, but if the array isn't fully engineered for flash it is not going to be an optimal home for an expensive flash investment. Hybrid arrays can offer good performance for the portion of IO that receives flash acceleration, but network latency often exceeds the latency savings that array-side flash provides. Worse, it is impossible for a remote SAN to know which IO coming from a virtualized host should be cached or prioritized in flash; it all looks the same and is blended together by the time it hits the array.
Some organizations deploy even more costly all-flash arrays, which can guarantee array-side performance for all IO and promise to simplify administration overhead. For a single key workload, a dedicated AFA can deliver great performance. However, virtual clusters mostly host mixed workloads, many of which don't or won't benefit from the expense of persisting all data on all-flash array storage. Bottom line: from a financial perspective, SAN flash is always more expensive than server-side flash. And by placing flash remotely across a network in the SAN, there is always a relatively large network latency that diminishes the benefit of that array-side flash investment.
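To put rough numbers on that latency argument, consider the back-of-the-envelope comparison below. The figures are assumed typical values chosen only for illustration, not measurements of any particular array, SSD, or network.
```python
# Back-of-the-envelope latency comparison (assumed typical values, microseconds).
local_ssd_read_us   = 100    # server-side flash read, no network hop
array_flash_read_us = 100    # the flash media in the array is just as fast...
network_rtt_us      = 500    # ...but the SAN fabric + array controller round trip adds this

server_side_total = local_ssd_read_us
array_side_total  = array_flash_read_us + network_rtt_us

print(f"server-side flash: ~{server_side_total} us per read")
print(f"array-side flash:  ~{array_side_total} us per read "
      f"({array_side_total / server_side_total:.0f}x)")
```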
HyperConverged Infrastructures - Software Defined Storage (SDS)
As faster resources like flash came down in price, especially when added directly to servers, so-called Software Defined Storage (SDS) options proliferated. Because CPUs have continuously grown faster and denser over the years, many traditional arrays came to be built on plain servers running custom storage operating systems. The resulting storage "software" is now often packaged as a more cost-effective "software-defined" solution that can be run, or converged, directly on servers (although we note most IT shops prefer buying ready-to-run solutions, not software requiring on-site integration).
In most cases software-defined storage runs within virtual machines or containers so that storage services can be hosted on the same servers as compute workloads (e.g. VMware VSAN). An IO-hungry application accessing local storage services can get excellent IO service (i.e. no network latency), but capacity planning and performance tuning in these co-hosted infrastructures can be exceedingly difficult. Acceptable solutions must provide tremendous insight or complex QoS facilities that can dynamically shift IO acceleration along with workloads as they move across a cluster (e.g. to keep data access local). Additionally, there is often a huge increase in East-West traffic between servers.
Software Defined Storage enabled a new kind of HyperConverged Infrastructure (HCI). Hyperconvergence vendors produce modular appliances in which a hypervisor (or container management), networking and (software-defined) storage all are pre-integrated to run within the same server. Because of vendor-specific storage, network, and compute integration, HCI solutions can offer uniquely optimized IO paths with plug-and-play scalability for certain types of workloads (e.g. VDI).
For highly virtualized IT shops, HCI simplifies many infrastructure admin responsibilities. But HCI presents new challenges too, not least of which is that migration to HCI requires a complete forklift turnover of all infrastructure. Converting all of your IT infrastructure to a unique vendor appliance creates a "full stack" single-vendor lock-in issue (and increased risk due to lowered infrastructure "diversity").
As server-side flash is cheaper than other flash deployment options, and servers themselves are commodity resources, HCI does help optimize the total return on infrastructure CAPEX, especially as compared to traditional siloed server and SAN architectures. But because of the locked-down vendor appliance modularity, it can be difficult to scale storage independently from compute when needed (or even just storage performance from storage capacity). Obviously, pre-configured HCI vendor SKUs also preclude using existing hardware or taking advantage of blade-type solutions.
With HCI, every node is also a storage node, which at scale can have big impacts on software licensing (e.g. if you need to add nodes just for capacity, you will also pay for compute licenses), generate heavy "East-West" network traffic, and in some cases create unacceptable data availability risks (e.g. when servers lock up, crash, or reboot for any reason, an HCI replication/rebuild can open a highly vulnerable window).
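As a simple illustration of the licensing point, the arithmetic below uses invented per-node figures to show how adding HCI nodes purely for capacity also drags along compute and software license cost. The numbers are assumptions chosen for illustration only, not pricing from any vendor.
```python
# Illustrative only: invented per-node numbers showing how capacity-driven HCI
# growth also incurs compute/software licensing cost.
usable_tb_per_node    = 10        # assumed usable capacity per HCI node (TB)
license_per_node      = 7_000     # assumed hypervisor + HCI software cost ($/node/yr)
extra_capacity_needed = 100       # TB of additional capacity, no new compute needed

nodes_added = -(-extra_capacity_needed // usable_tb_per_node)   # ceiling division
stranded_license_cost = nodes_added * license_per_node

print(f"nodes added just for capacity: {nodes_added}")
print(f"license cost carried along:    ${stranded_license_cost:,}/yr")
```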
OPEN Converged Infrastructure - Server Powered Storage (SPS)
When it comes to performance, IO may still need to transit a network, incurring a latency penalty. To help, there are several third-party IO caching products that can be layered into the IO path, integrated with the server or hypervisor driver stack or even placed in the network. These caching solutions take advantage of server memory or flash to help accelerate IO. However, layering yet another vendor and product into the IO path adds cost and complicates end-to-end IO visibility. Multiple layers of caches (VM, hypervisor, server, network, storage) can disguise a multitude of performance issues that ultimately degrade service.
Ideally, end-to-end IO, from within each local server to shared capacity, should all fall under a single converged storage solution, one focused on providing the best IO service by distributing and coordinating storage functionality where it best serves the IO-consuming applications. It should also optimize IT's governance, cost, and data protection requirements. Some HCI solutions might claim this in total, but only by converging everything into a single vendor appliance. But what if you want an easier solution capable of simply replacing aging arrays in your existing virtualized environments, especially one that enables scalability in multiple directions at different times and delivers extremely low latency while still supporting a complex mix of diverse workloads?
This is where we’d look to a Server Powered Storage (SPS) design. For example, Datrium DVX still protects data with cost-efficient shared data servers on the back-end for enterprise quality data protection, yet all the compute-intensive, performance-impacting functionality is “pushed” up into each server to provide local, accelerated IO. As Datrium’s design leverages each application server instead of requiring dedicated storage controllers, the cost of Datrium compared to traditional arrays is quite favorable, and the performance is even better than (and as scalable as) a 3rd party cache layered over a remote SAN.
In the resulting Datrium "open converged" infrastructure stack, all IO is deduplicated and compressed (and locally served) server-side to optimize storage resources and IO performance, while storage management is fully VM-centric (no LUNs to manage). In this distributed, open, and unlocked architecture, storage performance scales naturally with each server added, keeping pace with application growth.
Datrium DVX gets great leverage from a given flash investment by using any "bring-your-own" SSDs, which are far cheaper to add than array-side flash (and can be added to specific servers as needed or desired). In fact, most VMs and workloads won't ever read from the shared capacity on the network; it serves as write-optimized, persistent data protection and can be filled with cost-effective, high-capacity drives.
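To illustrate the general idea of deduplicating and compressing data on the server before anything is written to shared capacity, here is a generic, simplified sketch of the common content-hash-plus-compression pattern. It is not Datrium's actual data path; the class and names are invented for illustration.
```python
# Generic illustration of a server-side dedupe + compress write path:
# fingerprint each block, skip blocks already stored, compress the rest before
# sending them to shared capacity. Not Datrium's implementation.
import hashlib
import zlib

class ServerSideWritePath:
    def __init__(self, shared_store):
        self.shared_store = shared_store   # dict-like: fingerprint -> compressed block
        self.local_index = set()           # fingerprints this server already wrote

    def write(self, data, block_size=4096):
        manifest = []                      # ordered fingerprints reconstruct the object
        for off in range(0, len(data), block_size):
            block = data[off:off + block_size]
            fp = hashlib.sha256(block).hexdigest()
            manifest.append(fp)
            if fp in self.local_index or fp in self.shared_store:
                continue                   # duplicate block: no data leaves the server
            self.shared_store[fp] = zlib.compress(block)   # unique block: compress, persist
            self.local_index.add(fp)
        return manifest

    def read(self, manifest):
        return b"".join(zlib.decompress(self.shared_store[fp]) for fp in manifest)
```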
Taneja Group Opinion
Among IT's major concerns, all data bits must, at the end of the day, be persisted and fully managed and protected somewhere. Traditional arrays, converged or not, just don't perform well in highly virtualized environments, and using SDS (powering HCI solutions) to farm all that critical data across fungible compute servers raises some serious data protection challenges. It just makes sense to look for a solution that leverages the best aspects of both enterprise arrays (for data protection) and software/hyperconverged solutions (which localize data services for performance).
At the big picture level, Server Powered Storage can be seen as similar (although more cost-effective and performant) to a multi-vendor solution in which IT layers server-side IO acceleration functionality from one vendor over legacy or existing SANs from another vendor. But now we are seeing a convergence (yes, this is an overused word these days, but accurate here) of those IO path layers into a single vendor product. Of course, a single vendor solution that fully integrates distributed capabilities in one deployable solution will perform better and be naturally easier to manage and support (and likely cheaper).
There is no point in writing storage RFPs today that get tangled up in terms like SDS or HCI. Ultimately the right answer for any scenario is to do what is best for applications and application owners while meeting IT responsibilities. For existing virtualization environments, new approaches like Server Powered Storage and Open Convergence offer considerable benefit in terms of performance and cost (both OPEX and CAPEX). We highly recommend that before one invests in expensive all-flash arrays, or takes on a full migration to HCI, an Open Convergence option like Datrium DVX be considered as a potentially simpler, more cost-effective, and immediately rewarding solution.
NOTICE: The information and product recommendations made by the TANEJA GROUP are based upon public information and sources and may also include personal opinions both of the TANEJA GROUP and others, all of which we believe to be accurate and reliable. However, as market conditions change and not within our control, the information and recommendations are made without warranty of any kind. All product names used and mentioned herein are the trademarks of their respective owners. The TANEJA GROUP, Inc. assumes no responsibility or liability for any damages whatsoever (including incidental, consequential or otherwise), caused by your use of, or reliance upon, the information and recommendations presented herein, nor for any inadvertent errors that may appear in this document.
HPE pays $650 million for SimpliVity hyper-convergence
The long-awaited HPE-SimpliVity deal cost HPE $650 million for the hyper-converged pioneer. The buy gives HPE an installed base, as well as data reduction and protection features.
- Premiered: 01/18/17
- Author: Taneja Group
- Published: TechTarget: Search Converged Infrastructure
Internet of things data security proves vital in digitized world
Securing IoT data should become a priority as more companies manipulate the volumes produced by these devices. Seemingly innocuous information could allow privacy invasions.
- Premiered: 03/17/17
- Author: Mike Matchett
- Published: TechTarget: Search IT Operations