Big Data Slowing You Down? Do It In Memory With ScaleOut Software hServer v2

In the run-up to Strata/Hadoop 2013 in New York at the end of October, we are hearing a lot about exciting new ways to implement big data solutions. One of the most interesting is the recent release of ScaleOut's hServer V2, which extends their high-performance "in-memory data grid" (IMDG) to further support Hadoop workloads - with a reported 20x speedup on MapReduce jobs.

The first version of hServer let its IMDG serve as a fast data store for Hadoop HDFS clients, helping meet more real-time needs. V2 goes a step further by implementing a Hadoop API-compatible MapReduce engine right inside the IMDG - really moving the compute to where the data is!

All the Hadoop ecosystem products can make use of hServer as a much faster core MapReduce (MR) engine and data repository. We predict that many Hadoop MR applications seeking more real-time speeds might now find them with hServer while still leveraging the same apps and skill sets, without needing to step up to one of the emerging real-time processing alternatives or add-on projects.

While this won't address every possible Hadoop application need, there are many ways to leverage hServer now that it processes MR jobs directly with lightning-fast data access. For data too large to process fully in memory, hServer can still read from HDFS while using the IMDG for the intermediate shuffle/reduce steps.
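For readers less familiar with the phases mentioned above, here is a minimal, generic sketch of map, shuffle, and reduce running entirely in memory on a single machine. It is plain Python for illustration only and does not use or represent the hServer or Hadoop APIs; all function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `run_word_count`) are hypothetical.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to each input record, yielding (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle_phase(pairs):
    """Group intermediate values by key, entirely in memory."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_count_mapper(line):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def sum_reducer(key, values):
    """Add up the counts collected for one word."""
    return sum(values)


def run_word_count(lines):
    """Chain the three phases; no intermediate data is written to disk."""
    pairs = map_phase(lines, word_count_mapper)
    groups = shuffle_phase(pairs)
    return reduce_phase(groups, sum_reducer)


if __name__ == "__main__":
    print(run_word_count(["big data", "Big memory"]))
```

In a real cluster the shuffle step moves intermediate pairs between nodes, which is the stage the article says hServer can keep in the IMDG even when the input is read from HDFS.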

Because hServer is basically a pre-packaged Hadoop solution that computes MapReduce at high speeds, and is available in a free community edition good for up to 4 nodes and 256 GB, ScaleOut foresees that hardcore Hadoop developers could start to adopt it as an agile development target. Load a subset of data into one of these "small" free hServer instances and your MapReduce development/test cycles can be greatly accelerated - and by greatly we mean cycles short enough to develop incrementally, as you would in a local Ruby or Python interactive scripting session. Apps aiming at PB scale can then simply be retargeted to "normal" Hadoop production with a one-line change when ready (or, of course, to an enterprise license of hServer at whatever cluster scale is needed).

ScaleOut Software seems to have delivered an exceptionally optimized core Hadoop engine. It will be fun to see how much long-running Hadoop jobs speed up when retargeted to hServer V2. I can't help but wonder what these guys will do next for a V3.

  • Premiered: 10/02/13
  • Author: Mike Matchett
Topic(s): Big Data ScaleOut hServer Hadoop