Unifying Big Data Through Virtualized Data Services - Iguaz.io Rewrites the Storage Stack
One of the more interesting new companies to arrive on the big data storage scene is iguaz.io. The iguaz.io team has designed a whole new, purpose-built storage stack that can store a single copy of master data and serve it in multiple formats, at high performance and parallel streaming speeds, to many different kinds of big data applications. This promises to obliterate today's spaghetti data flows, with their many moving parts, numerous transformation and copy steps, and the Frankenstein architectures currently required to stitch together increasingly complex big data workflows. We've seen that enterprises commonly need to build environments spanning from streaming ingest and real-time processing through interactive query and into larger data lake and historical archive analysis, and they end up making multiple data copies, in multiple storage formats, across multiple storage services.
This new unified, virtualized storage service has three fundamental layers. The top, virtualizing layer is made up of containerized, stateless microservice API translators that present file, object, stream, and various NoSQL interfaces to storage clients. This is a great application of microservice architecture: scalable, fungible, and non-blocking.
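To make the idea concrete, here is a minimal sketch of that top layer: two stateless "translator" front ends, one object-style and one file-style, serving the same master data out of a shared store. All of the class and method names here (DataObjectStore, ObjectAPI, FileAPI) are illustrative assumptions, not iguaz.io's actual interfaces.

```python
class DataObjectStore:
    """Stand-in for the shared core engine: maps keys to bytes."""
    def __init__(self):
        self._objects = {}

    def put(self, key, data):
        self._objects[key] = data

    def get(self, key):
        return self._objects[key]


class ObjectAPI:
    """S3-style view: bucket/key addressing, no state of its own."""
    def __init__(self, store):
        self.store = store

    def put_object(self, bucket, key, body):
        self.store.put(f"{bucket}/{key}", body)

    def get_object(self, bucket, key):
        return self.store.get(f"{bucket}/{key}")


class FileAPI:
    """POSIX-flavored view over the very same master data."""
    def __init__(self, store):
        self.store = store

    def read(self, path):
        return self.store.get(path.lstrip("/"))


store = DataObjectStore()
obj = ObjectAPI(store)
fs = FileAPI(store)

obj.put_object("logs", "2016/06/14.json", b'{"event": "ingest"}')
# The file front end sees the same bytes, with no copy step in between.
print(fs.read("/logs/2016/06/14.json"))
```

Because the translators hold no state of their own, any number of them can be spun up or torn down against the same store, which is what makes the layer scalable and fungible.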
The second layer in iguaz.io's stack is its "data container"-based core data service engine. Here all data is stream-processed: indexed, compressed, buffered in memory (and/or NVMe) for protection and speed, and managed and tiered down to the third, "media" persistence layer. Security, QoS, and other storage functions are inserted into the data services pipeline at this core level as well. The resulting architecture looks like a pipeline for assembling (or serving) data objects to and from media, from and to the API users.
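The pipeline shape of that core engine can be sketched roughly as follows, with index, compress, and buffer stages on ingest and a destage step down to media. The stage ordering and all names here are my own assumptions for illustration, not iguaz.io's implementation.

```python
import zlib

class DataPipeline:
    """Toy version of a core engine that stream-processes every write."""
    def __init__(self):
        self.index = {}    # key -> record metadata (the index stage)
        self.buffer = {}   # in-memory/NVMe staging (the buffer stage)
        self.media = {}    # stand-in for the persistence layer

    def ingest(self, key, data):
        compressed = zlib.compress(data)            # compress stage
        self.index[key] = {"raw_size": len(data),   # index stage
                           "stored_size": len(compressed)}
        self.buffer[key] = compressed               # buffered for speed

    def tier_out(self):
        # Destage buffered objects down to the media layer.
        self.media.update(self.buffer)
        self.buffer.clear()

    def read(self, key):
        # Serve from the buffer if present, else fetch back from media.
        blob = self.buffer.get(key) or self.media[key]
        return zlib.decompress(blob)

pipe = DataPipeline()
pipe.ingest("sensor/42", b"temperature=21.5" * 100)
pipe.tier_out()
print(pipe.read("sensor/42")[:16])  # original data served back from media
```

In the real engine, security checks, QoS throttling, and the like would be additional stages slotted into the same ingest/serve path.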
The media layer can be assembled from file, block, object, and even cloud storage. Natively, iguaz.io has high affinity with high-performance object-oriented storage, since its data service engine maps all of the end-user-visible data formats into internal "data objects." Future hard drives with native low-level key/value object services will be very interesting here, but we expect that any storage that can be leveraged today could be pressed into service as big data storage media.
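The reason such different media can sit under one engine is that the engine only ever needs a simple key/value contract against the persistence layer. A minimal sketch, assuming a hypothetical write/read interface (these backend classes are illustrative, not iguaz.io's):

```python
import os
import tempfile

class MemoryBackend:
    """In-memory media, standing in for block or object storage."""
    def __init__(self):
        self._blobs = {}

    def write(self, key, blob):
        self._blobs[key] = blob

    def read(self, key):
        return self._blobs[key]

class FileBackend:
    """File-system media: each internal data object becomes one file."""
    def __init__(self, root):
        self.root = root

    def _path(self, key):
        return os.path.join(self.root, key.replace("/", "_"))

    def write(self, key, blob):
        with open(self._path(key), "wb") as f:
            f.write(blob)

    def read(self, key):
        with open(self._path(key), "rb") as f:
            return f.read()

# The core engine sees only the write/read contract, so media can be
# swapped (memory, file, object, cloud) without touching upper layers.
for backend in (MemoryBackend(), FileBackend(tempfile.mkdtemp())):
    backend.write("table/row-7", b"col1,col2")
    assert backend.read("table/row-7") == b"col1,col2"
print("both backends serve the same data object")
```

A drive exposing native key/value object services would simply be one more backend honoring the same contract, with no translation layer needed at all.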
What's not obvious from a few sentences here is just how much this can simplify a multi-stage big data workflow, and how much TCO (both capex and opex) it can save, on top of the accelerated time to value, increased agility, and newly unlocked big data opportunities it can provide. With IoT coming on strong, iguaz.io may be unlocking the enterprise answer to big data storage.
- Premiered: 06/14/16
- Author: Mike Matchett