Taneja Group | Hadoop
Trusted Business Advisors, Expert Technology Analysts

Items Tagged: Hadoop

news / Blog

Amazon Defines Big Data As Big Opportunity

According to Amazon, size doesn't really matter in its definition of Big Data. Instead, it's more about the threshold at which distributed processing solutions like Elastic MapReduce start providing cost-effective development and operations.

  • Premiered: 04/27/12
  • Author: Mike Matchett
Topic(s): Big Data Cloud Amazon Hadoop
news / Blog

Cleversafe and Hadoop

Cleversafe is announcing a platform integration with Hadoop as part of their upcoming Cleversafe 3.0 release scheduled for later this year.

  • Premiered: 07/10/12
  • Author: Taneja Group
Topic(s): Hadoop Cleversafe Big Data

Symantec tackles Hadoop storage, 'big data' analytics

Symantec Corp. today announced an Apache Hadoop add-on capability for its Veritas Cluster File System to help run "big data" analytics on storage area networks instead of scale-out, commodity servers using local storage.

  • Premiered: 08/13/12
  • Author: Taneja Group
  • Published: TechTarget: SearchStorage.com
Topic(s): Hadoop, Big Data, Mike Matchett, SearchStorage, TechTarget, Symantec, analytics
news / Blog

Daegis Acumen for Technology Assisted Review on Big Data

Daegis Acumen, part of the Daegis eDiscovery Platform, offers a hosted solution that handles big data with Hadoop-based clustering and Probabilistic Latent Semantic indexing.

  • Premiered: 10/19/12
  • Author: Taneja Group
Topic(s): Daegis predictive coding predictive technology Hadoop eDiscovery review
news / Blog

Enterprise IT Will Dive Into Big Data Solutions in 2013

If you are in IT, 2013 is going to be the year that you will want to dive into the "big data" pool if you haven't been pushed in already. But don't worry - it's no longer sink or swim. For one, we'll be here to help coach IT folks through it all. And while the concepts, terminology and hype have been all over the place, once you start floating around you'll find that under the surface much of what fills the big data pool is familiar IT infrastructure, data management, and services re-cast around a few easy-to-grasp innovations. For example, if you are in IT and asked to pick a Hadoop distro to stand up, you'd probably start with evaluating the three main distributions of Hadoop (other than getting it straight off Apache) followed by other downstream OEM'd and pre-integrated versions. The main distros are from Cloudera, Hortonworks, and MapR. I didn't really appreciate the differences until talking with all three individually (at 2012 NY Strata, see below).

  • Premiered: 01/15/13
  • Author: Mike Matchett
Topic(s): Big Data Hadoop Cloudera MapR Hortonworks Dell EMC strata Apache
news / Blog

Big Data Appliance Wrapped Up for the Enterprise

Here we are in Santa Clara eagerly awaiting Strata... Hadoop's R&D infant years are passing, and it is now at an age where vendors are truly adding value for the enterprise IT shop. Clearly the theme is to wrap up low-level complexities into higher-value solutions. One standout announcement is DDN's hScaler appliance - a monster of a Hadoop machine.

  • Premiered: 02/25/13
  • Author: Mike Matchett
Topic(s): DDN Big Data Hadoop
news / Blog

Is Hadoop the New Data Center Platform for All Data?

This morning we were able to attend EMC Greenplum's launch of their new Hadoop distro called Pivotal HD. Core to this distro is HAWQ, their new massively parallel processing analytical database built with Hadoop at its heart... consider that horizontal multi-PB scale-out, business class interactive performance, and high-end easily leveraged analytics are now available in one package from a trusted enterprise vendor. This is fully SQL compliant analytical database stuff...

  • Premiered: 02/25/13
  • Author: Mike Matchett
Topic(s): EMC Pivotal HD Hadoop

Big data projects require big changes in hardware and software

IT pros called in on big data projects are finding that the typical approach doesn’t play nice on enterprise-grade virtualized infrastructure.

  • Premiered: 03/12/13
  • Author: Taneja Group
  • Published: TechTarget: SearchDataCenter.com
Topic(s): Big Data, Hadoop, Jeff Boles, Virtualization
news / Blog

Nothing's Too Fast for Operational Intelligence - ScaleOut Software's hServer

There are a lot of HPC technologies coming soon to a data center near you! The latest offering from ScaleOut Software, known for their in-memory data grid solutions, is a customized in-memory data grid for Hadoop. This enables blistering-fast, big-data-style real-time analysis of dynamically changing data. Solutions that use this are processing live operational data into actionable intelligence - financials, reservation systems, live customer experience, ...

  • Premiered: 04/16/13
  • Author: Mike Matchett
Topic(s): ScaleOut Software Hadoop Big Data HDFS

Extreme Applications in the Enterprise Drive Parallel File System Adoption

With the advent of big data and cloud-scale delivery, companies are racing to deploy cutting-edge services that include “extreme” applications like massive voice and image processing or complex financial analysis modeling that can push storage systems to their limits. Examples of some high visibility and big market impacting solutions include applications based on image pattern recognition at large scale and financial risk management based on decision-making at high speed.

These ground-breaking solutions, made up of very different activities but with similar data storage challenges, create incredible new lines of business representing significant revenue potential. Every day here at Taneja Group we see more and more mainstream enterprises exploring similar “extreme service” opportunities. But when enterprise IT data centers take stock of what is required to host and deliver these new services, it quickly becomes apparent that traditional clustered and even scale-out file systems - of the kind that most enterprise data centers (or cloud providers) have racks and racks of - simply can’t handle the performance requirements.

There are already great enterprise storage solutions for applications that need either raw throughput, high capacity, parallel access, low latency, or high availability – maybe even for two or three of those at a time. But when an “extreme” application needs all of those requirements at the same time, only supercomputing type storage in the form of parallel file systems provides a functional solution. The problem is that most commercial enterprises simply can’t afford or risk basing a line of business on an expensive research project.

The good news is that some storage vendors have been industrializing former supercomputing storage technologies, hardening massively parallel file systems into commercially viable solutions. This opens the door for revolutionary services creation, enabling mainstream enterprise datacenters to support the exploitation of new extreme applications.  

Publish date: 05/03/13

Big Virtualization: VMware is Virtualizing Hadoop

VMware today announced advancements that will allow vSphere to manage Hadoop clusters.

  • Premiered: 06/26/13
  • Author: Taneja Group
  • Published: NetworkWorld.com
Topic(s): Hadoop, VMWare, vSphere, VMware vSphere, Big Data, Virtualization

Shoveling a big pile of Hadoop? - Virtualized Hadoop for IT

This 30 minute webcast will address the following:

-Why virtualize Hadoop? What are the benefits to IT and the user?

-Who are the players? VMware, Project Savanna (RedHat), Amazon EMR

-How does virtualizing Hadoop work technically with its scale-out computing and distributed storage models?

-What's the impact on performance?

-How virtualized Hadoop becomes a foundation of the datacenter as a unified platform for all kinds of workloads.
IT can now offer Big-Data-as-a-service.

About the speaker: Mike Matchett brings over 20 years of experience in managing and marketing IT datacenter solutions, particularly at the nexus of performance, capacity and virtualization. Currently he is focused on IT optimization for virtualization and convergence across servers, storage and networks, especially to handle the requirements of mission-critical applications, Big Data analysis, and the next generation data center. Mike has a deep understanding of systems management, IT operations, and solutions marketing to help drive architecture, messaging, and positioning initiatives. For more info visit: http://tanejagroup.com/about/who-we-are

  • Premiered: 07/30/13 at 1:30pm ET (10:30am PT)
  • Location: Live and OnDemand
  • Speaker(s): Mike Matchett
  • Sponsor(s): BrightTALK, Taneja Group
Topic(s): BrightTALK, Hadoop, Virtualization, Big Data, Webinar, Webcast

Virtualizing Hadoop Impacts Big Data Storage

Hadoop is soon coming to enterprise IT in a big way. VMware’s new vSphere Big Data Extensions (BDE) commercializes its open source Project Serengeti to make it dead easy for enterprise admins to spin up and down virtual Hadoop clusters at will.

  • Premiered: 07/17/13
  • Author: Mike Matchett
  • Published: Enterprise Storage Forum
Topic(s): Virtualization, VMWare, Hadoop, Project Serengeti, vSphere, Big Data, NAS, SAN, DAS, HDFS, HVE, Apache, scale-out, Hypervisor, EMC World 2013, EMC World, virtualizing Hadoop, Project Savannah, OpenStack, KVM

Don't Miss These VMworld 2013 Sessions

With 358 sessions, time is money. Here are five sessions where your time will be well spent.

  • Premiered: 07/01/13
  • Author: Mike Matchett
  • Published: Virtualization Review
Topic(s): VMWorld, VMworld 2013, VDI, PCoiP, Virtualization, SDN, Network Virtualization, VMWare, VMware vSphere, ITaaS, Hybrid Cloud, Hadoop

Myths Surrounding Big Data Technology

Big data technology is a big deal for storage shops, and a clear understanding of what it means -- and doesn't mean -- is required to successfully configure storage for big data apps.

  • Premiered: 08/08/13
  • Author: Mike Matchett
  • Published: TechTarget: SearchStorage
Topic(s): Big Data, Storage, Cloudera, Apache, Hadoop, HDFS, MapR, NFS, CIFS, EMC, Isilon, DDN, DataDirect Networks, hScaler, Hortonworks

Market Landscape Abstract: Enterprise Hadoop Infrastructure for Big Data IT

Hadoop is coming to enterprise IT in a big way. The competitive advantage that can be gained from analyzing big data is just too “big” to ignore. And the amount of data available to crunch is only growing bigger, whether from new sensors, capture of people, systems and process “data exhaust”, or just longer retention of available raw or low-level details. It’s clear that enterprise IT practitioners everywhere are soon going to have to operate scale-out computing platforms in the production data center, and being the first, most mature solution on the scene, Hadoop is the likely target. The good news is that there is now a plethora of Hadoop infrastructure options to choose from to fit almost every practical big data need – the challenge now for IT is to implement the best solutions for their business client needs.

Apache Hadoop as originally designed had a relatively narrow application: certain kinds of batch-mode parallel algorithms applied over unstructured (or semi-structured, depending on your definition) data. But because of its widely available open source nature, commodity architecture approach, and ability to extract new kinds of value out of previously discarded or ignored data sets, the Hadoop ecosystem is rapidly evolving and expanding. With recent capabilities like YARN, which opens up the main execution platform to applications beyond batch MapReduce, along with the integration of structured data analysis, real-time streaming and query support, and the rollout of virtualized enterprise hosting options, Hadoop is quickly becoming a mainstream data processing platform.

There has been much talk that in order to derive top value from big data efforts, rare and potentially expensive data scientist types are needed to drive them. On the other hand, there is an abundance of higher level analytical tools and pre-packaged applications emerging to support the existing business analyst and user with familiar tools and interfaces. While completely new companies have been founded on the exciting information and operational intelligence gained from exploiting big data, we expect wider adoption by existing organizations based on augmenting traditional lines of business with new insight and revenue enhancing opportunity. In addition, a Hadoop infrastructure serves as a great data capture and ETL base for extracting more structured data to feed downstream workflows, including traditional BI/DW solutions. No matter how you want to slice it, big data is becoming a common enterprise workload, and enterprise IT infrastructure folks will need to deploy, manage, and provide Hadoop services to their businesses.

Publish date: 10/01/13
news / Blog

Big Data Slowing You Down? Do It In Memory With ScaleOut Software hServer v2

In the run-up to Strata/Hadoop in NY coming up here at the end of October, we are hearing a lot about some exciting new ways to implement big data solutions. One of the most interesting is the recent release of ScaleOut's hServer V2, which evolves their high performance "in-memory data grid" (IMDG) to further support Hadoop workloads - gaining a reported 20x speedup on MapReduce jobs.

  • Premiered: 10/02/13
  • Author: Mike Matchett
Topic(s): Big Data ScaleOut hServer Hadoop
news / Blog

Choosing The Best Hadoop Infrastructure for Enterprise IT

We've published a new market landscape on Enterprise Hadoop Infrastructure aimed at helping IT folks survey, evaluate and choose the right Hadoop distribution and supporting server and storage infrastructure... One of the big takeaways from this analysis is that Hadoop is coming in a big way to enterprise IT organizations, whether they are familiar with big data architectures or not... we aimed to address the first two big questions about supporting big data in IT: 1. Which Hadoop distribution makes the most sense? 2. What is the right infrastructure/deployment model given Hadoop is available in physical, cloud, and virtual forms, with appliance, converged, and external storage options?

  • Premiered: 10/03/13
  • Author: Mike Matchett
Topic(s): Big Data Market Landscape Hadoop

Hadoop Coming to Enterprise IT in Big Way – Taneja Group

Hadoop is coming to enterprise IT in a big way. The competitive advantage that can be gained from analyzing big data is just too 'big' to ignore. And the amount of data available to crunch is only growing bigger, whether from new sensors, capture of people, systems and process 'data exhaust', or just longer retention of available raw or low-level details.

  • Premiered: 10/16/13
  • Author: Taneja Group
  • Published: Storage Newsletter
Topic(s): Hadoop, Big Data, Storage, Apache, Virtualization, Dell, HP, Project Serengeti, VMWare, Mirantis, RedHat, Project Savanna, DDN, NetApp, Teradata, Oracle, EMC, Isilon

Big Data Storage Options for Enterprise Hadoop

In this webcast, Sr. IT Analyst Mike Matchett from Taneja Group will briefly review the storage architecture of Hadoop and HDFS, and then examine some of the more prominent big data storage options for enterprises with data protection, integration, and governance concerns that might lead them to choose an advanced SAN/NAS solution over the default local DAS design.

  • Premiered: 12/10/13 at 10 am PT/ 1 pm ET
  • Location: OnDemand
  • Speaker(s): Mike Matchett, Senior Analyst, Taneja Group
Topic(s): BrightTALK, Mike Matchett, Hadoop, Storage, Enterprise Storage, SAN, NAS, DAS, HDFS, MapReduce