
Big Data

Taneja Blog / Systems and Technology / Big Data

HPE Welcomes You To The Machine!

HPE has publicly rolled out its “The Machine” prototype, carrying 160TB of fabric-attached memory, 1,280 ARM cores, and 100Gb/s fiber interconnects. OK, so this is a whole lot of memory! But it’s not just about memory.

In both HPC and big data analytics, and in increasingly converged applications that combine analytics with operational processes at scale, the game is all about increasing data locality to compute. Ten years ago, Hadoop unlocked massive-scale data processing for certain classes of problems by “mapping/reducing” compute and big data across a cluster of commodity servers. We might look at that as the “airy,” “cloudy” kind of approach. Then Spark came along and showed how we really need to tackle big data sets in memory, though still across a cluster architecture.
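The map/shuffle/reduce pattern that Hadoop popularized can be sketched in a few lines of plain Python. This is a single-machine toy for illustration only, not Hadoop's actual API; the function names here are hypothetical:

```python
# Toy sketch of the MapReduce model: map each record to (key, value)
# pairs, shuffle (group) by key, then reduce each group.
# In Hadoop the shuffle happens across a cluster; here it is in-process.
from collections import defaultdict

def map_phase(records):
    # map: emit a (word, 1) pair for every word in every record
    for record in records:
        for word in record.split():
            yield (word, 1)

def shuffle(pairs):
    # shuffle: group all emitted values by key
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # reduce: combine each key's values into a single result
    return {key: sum(values) for key, values in groups.items()}

records = ["big data big compute", "data locality"]
counts = reduce_phase(shuffle(map_phase(records)))
# counts == {"big": 2, "data": 2, "compute": 1, "locality": 1}
```

Spark's key insight was to keep intermediate results like these groups resident in cluster memory between stages, rather than writing them back to disk after every map/reduce pass.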

Here today we see the bleeding edge of what aggressive hardware engineering can do to practically cram massive memory and large numbers of cores all together - compressing and converging compute and big data as densely as possible - in some ways a throwback nod to the old mainframe. Folks with highly interconnected, HPC-style modeling and simulation needs that haven’t been well served by commodity scale-out (i.e. affordable) big data analytic architectures are going to want to look closely at this development. In fact, HPE has modified a version of Spark to make different internal assumptions matched to this new architecture, to great effect (at least 15x)…

  • Premiered: 05/16/17
  • Author: Mike Matchett
Topic(s): HPE The Machine Big Data Spark IoT Mike Matchett

The New Big Thing in Big Data: Results From Our Apache Spark Survey

In the last few months I’ve been really bullish on Apache Spark as a big enabler of wider big data solution adoption. Recently we got the great opportunity to conduct some deep Spark market research (with Cloudera’s sponsorship) and were able to survey nearly seven thousand (6,900+) highly qualified technical and managerial people working with big data from around the world. ... Some highlights: first, across the broad range of industries, company sizes, and big data maturities, over one-half (54%) of respondents are already actively using Spark to solve a primary organizational use case. That’s an incredible adoption rate…

  • Premiered: 12/14/16
  • Author: Mike Matchett
Topic(s): Apache Spark Big Data Cloudera

Looking For a Next Gen Data Center Platform? MapR Wants To Talk.

It’s almost 2017 - where is your organization with regard to getting any value out of big data projects? Are you still dabbling in endless POCs, or maybe you haven’t even gotten a big data project approved yet? I just got back from MapR’s first-ever analyst conference. More on MapR in a moment, but first let me tell you about one of the interesting big data market adoption perspectives I heard there. One of their partners claimed that a big roadblock to wider big data adoption was senior IT decision-makers holding back investment or project approval until they could be assured that any resulting application could be put into actual enterprise production, regardless of business value…

  • Premiered: 12/14/16
  • Author: Mike Matchett
Topic(s): Big Data MapR Converged Analytics
Taneja Blog / Virtualization / Systems and Technology / Big Data

Unifying Big Data Through Virtualized Data Services - Iguaz.io Rewrites the Storage Stack

One of the more interesting new companies to arrive on the big data storage scene is iguaz.io. The iguaz.io team has designed a whole new, purpose-built storage stack that can store and serve the same master data in multiple formats, at high performance and parallel streaming speeds, to multiple different kinds of big data applications. This promises to obliterate today’s spaghetti data flows, with their many moving parts, numerous transformation and copy steps, and the Frankenstein architectures currently required to stitch together increasingly complex big data workflows. We’ve seen that enterprises commonly need to build environments spanning streaming ingest and real-time processing through interactive query and into larger data lake and historical archive analysis, and they end up making multiple data copies in multiple storage formats across multiple storage services.

  • Premiered: 06/14/16
  • Author: Mike Matchett
Topic(s): iguaz.io

Agile Big Data Clusters: DriveScale Enables Bare Metal Cloud

We’ve been writing recently about the hot, potentially inevitable trend toward a dense IT infrastructure in which components like CPU cores and disks are not only commoditized but deployed in massive stacks or pools (with fast matrixing switches between them). A layered provisioning solution can then dynamically compose any desired “physical” server or cluster out of those components. Conceptually, this becomes the foundation for a bare-metal cloud. Today DriveScale announces its agile architecture built on this approach, aimed first at solving big data multi-cluster operational challenges.

  • Premiered: 05/19/16
  • Author: Mike Matchett
Topic(s): DriveScale Big Data Composable Cloud

Scaling All Flash to New Heights - DDN Flashscale All Flash Array Brings HPC to the Data Center

It’s time to start thinking about massive amounts of flash in the enterprise data center. I mean PBs of flash for the biggest, baddest, fastest data-driven applications out there. This amount of flash requires an HPC-capable storage solution brought down and packaged for enterprise IT management, which is where DataDirect Networks (DDN) is stepping up. Perhaps too quietly, they have been hard at work pivoting their high-end HPC portfolio into the enterprise space. Today they are rolling out the Flashscale 14KXi, a massively scalable, flash-centric storage array that will help them offer complete, comprehensive single-vendor big data workflow solutions - from the fastest scratch storage through the biggest-throughput parallel file systems into the largest distributed object storage archives.

  • Premiered: 05/17/16
  • Author: Mike Matchett
Topic(s): DDN HPC Big Data SSD Flash SAN Lustre GPFS