Taneja Group | Apache Spark
Trusted Business Advisors, Expert Technology Analysts

Items Tagged: Apache Spark

news

Making Sense of the Internet of Things with Converged Infrastructure

With its flexibility and scalability, converged infrastructure can be a good solution to the influx of IoT data.

  • Premiered: 03/22/16
  • Author: Taneja Group
  • Published: Windows IT Pro
Topic(s): Internet of Things, IoT, converged, Converged Infrastructure, convergence, IT infrastructure, Servers, Storage, network, flexibility, scalability, Data protection, storage architecture, Hadoop, Apache, Spark, Apache Spark, structured data, Mike Matchett
news

Spark speeds up adoption of big data clusters and clouds

Infrastructure that supports big data comes from both the cloud and clusters. Enterprises can mix and match these seven infrastructure choices to meet their needs.

  • Premiered: 07/19/16
  • Author: Mike Matchett
  • Published: TechTarget: Search IT Operations
Topic(s): Apache Spark, Spark, Mike Matchett, Cloud, cloud cluster, cluster, Big Data, big data analytics, MapReduce, Business Intelligence, BI, MLlib, High Performance, hadoop cluster, HDFS, Hadoop Distributed File System, IBM, Hortonworks, Cloudera, capacity management, Performance Management, API, SAN, storage area networks, CAPEX, DataDirect Networks, HPC, Lustre, Virtualization, VM
news

Kinetica Unveils GPU-accelerated Database for Analyzing Streaming Data with Enhanced Performance

Kinetica today announced the newest release of its distributed, in-memory database accelerated by GPUs that simultaneously ingests, explores, and visualizes streaming data.

  • Premiered: 09/21/16
  • Author: Taneja Group
  • Published: Business Wire
Topic(s): high availability, Mike Matchett, Kinetica, In-Memory, Security, IoT, Internet of Things, Data Management, OLTP, CPU, GPU, NVIDIA, Data Center, scalability, Apache, Hadoop, Apache Hadoop, Apache Kafka, Apache Spark, Apache NiFi, High Performance, cluster, Big Data, scale-out
Profiles/Reports

Apache Spark Market Survey: Cloudera Sponsored Research

Apache Spark has quickly grown into one of the major big data ecosystem projects and shows no signs of slowing down. In fact, even though Spark is well connected within the broader Hadoop ecosystem, Spark adoption by itself has enough energy and momentum that it may very well become the center of its own emerging market category. In order to better understand Spark’s growing role in big data, Taneja Group conducted a major Spark market research project. We surveyed nearly seven thousand (6900+) qualified technical and managerial people working with big data from around the world to explore their experiences with and intentions for Spark adoption and deployment, their current perceptions of the Spark marketplace and of the future of Spark itself.

We found that across the broad range of industries, company sizes, and big data maturities represented in the survey, over one-half (54%) of respondents are already actively using Spark. Spark is proving invaluable as 64% of those currently using Spark plan to notably increase their usage within the next 12 months. And new Spark user adoption is clearly growing – 4 out of 10 of those who are already familiar with Spark but not yet using it plan to deploy Spark soon.

The top reported use cases globally for Spark include the expected Data Processing/Engineering/ETL (55%), followed by forward-looking data science applications like Real-Time Stream Processing (44%), Exploratory Data Science (33%), and Machine Learning (33%). The more traditional analytics applications like Customer Intelligence (31%) and BI/DW (29%) were close behind, and illustrate that Spark is capable of supporting many different kinds of organizational big data needs. The main reasons and drivers reported for adopting Spark over other solutions start with Performance (mentioned by 74%), followed by capabilities for Advanced Analytics (49%), Stream Processing (42%) and Ease of Programming (37%).

When it comes to choosing a source for Spark, more than 6 out of 10 Spark users in the survey have considered or evaluated Cloudera, nearly double the 35% that may have looked at the Apache Download or the 33% that considered Hortonworks. Interestingly, almost all (90+%) of those looking at Cloudera Spark adopted it for their most important use case, equating to 57% of those who evaluated Cloudera overall. Organizations cited quality of support (46%) as their most important selection factor, followed by demonstrated commitment to open source (29%), enterprise licensing costs (27%) and the availability of cloud support (also 27%).

Interestingly, while on-premises Spark deployments dominate today (more than 50%), there is strong interest in transitioning many of those to cloud deployments going forward. Overall Spark deployment in public/private cloud (IaaS or PaaS) is projected to increase significantly from 23% today to 36%, along with a corresponding increase in using Spark SaaS, from 3% to 9%.

The biggest challenge with Spark, similar to what has been previously noted across the broader big data solutions space, is still reported by 6 out of 10 active users to be the big data skills/training gap within their organizations. Similarly, more than one-third mention complexity in learning/integrating Spark as a barrier to adoption. Despite these reservations, we note that compared to many previous big data analytics platforms, Spark today offers a higher—and often already familiar—level of interaction to users through its support of Python, R, SQL, notebooks, and seamless desktop-to-cluster operations, all of which no doubt contribute to its greatly increasing popularity and widespread adoption.

Overall, it’s clear that Spark has gained broad familiarity within the big data world and built significant momentum around adoption and deployment. The data highlights widespread current user success with Spark, validation of its reliability and usefulness to those who are considering adoption, and a growing set of use cases to which Spark can be successfully applied. Other big data solutions can offer some similar and overlapping capabilities (there is always something new just around the corner), but we believe that Spark, having already captured significant mindshare and proven real-world value, will continue to successfully expand on its own vortex of focus and energy for at least the next few years.

Publish date: 11/07/16
news

Apache Spark Survey Reveals Increased Growth in Users

In order to better understand Apache Spark’s growing role in big data, Taneja Group conducted a major market research project, surveying approximately 7,000 people.

  • Premiered: 11/08/16
  • Author: Taneja Group
  • Published: Satellite Press Releases
Topic(s): Apache, Apache Hadoop, Apache Spark, Hadoop, Storage, Big Data, Data Management, Cloudera, In-Memory, Mike Matchett
news

Machine learning and data science workloads ignite Apache Spark adoption

The use of Apache Spark is dramatically increasing as new workloads create more use cases.

  • Premiered: 11/08/16
  • Author: Taneja Group
  • Published: CBR Online
Topic(s): Apache, Apache Spark, Spark, Machine Learning, Big Data, Storage, Cloudera, Mike Matchett, analytics, Hadoop, Cloud, Public Cloud, Private Cloud, IBM, MapReduce
news

Four big data and AI trends to keep an eye on

AI is making a comeback - and it's going to affect your data center soon.

  • Premiered: 11/17/16
  • Author: Mike Matchett
  • Published: TechTarget: Search IT Operations
Topic(s): AI, Artificial Intelligence, Big Data, Data Center, Datacenter, Machine Learning, Apache, Apache Spark, Spark, Hadoop, MapReduce, latency, In-Memory, big data analytics, Business Intelligence, Python, Dataiku, Cask, ETL, data flow management, Virtualization, Storage, scale-up, scale-out, scalability, GPU, IBM, NVIDIA, Virtual Machine, VM
news / Blog

The New Big Thing in Big Data: Results From Our Apache Spark Survey

In the last few months I’ve been really bullish on Apache Spark as a big enabler of wider big data solution adoption. Recently we got the great opportunity to conduct some deep Spark market research (with Cloudera’s sponsorship) and were able to survey nearly seven thousand (6900+) highly qualified technical and managerial people working with big data from around the world. ... Some highlights -- First, across the broad range of industries, company sizes, and big data maturities, over one-half (54%) of respondents are already actively using Spark to solve a primary organizational use case. That’s an incredible adoption rate....

  • Premiered: 12/14/16
  • Author: Mike Matchett
Topic(s): Apache Spark, Big Data, Cloudera
news

New IT job requirements include soft skills, business acumen

In the constantly changing world of information technology, business acumen and soft skills have become as essential to finding and securing a job as technical skills.

  • Premiered: 12/27/16
  • Author: Taneja Group
  • Published: TechTarget: Search Server Virtualization
Topic(s): Storage, Virtualization, Mike Matchett, DevOps, VMWare, Spark, Apache Spark, SQL, graph database, Big Data, Machine Learning
news

With Apache Spark, Old Mainframes Learn New Tricks

Running Spark on the mainframe can be advantageous because data is co-located. One use is fraud detection.

  • Premiered: 12/26/16
  • Author: Taneja Group
  • Published: RT Insights
Topic(s): Apache, Apache Spark, Spark, IBM, ETL, Cloudera, Big Data
news

Impetus Technologies Announces StreamAnalytix 3.0 Feat. Support for Apache Spark-Based Batch Process

Impetus Technologies, a big data thought leader and software solutions company, today announced StreamAnalytix 3.0 featuring support for Apache Spark-based batch processing and enriched online and offline machine learning features, helping enterprises maximize the performance of their analytical models and achieve the most favorable business outcomes.

  • Premiered: 03/15/17
  • Author: Taneja Group
  • Published: Yahoo! Finance
Topic(s): StreamAnalytix, Impetus Technologies, Mike Matchett, Apache Spark, Spark, Apache, Machine Learning, big data analytics, Big Data, Hadoop, Apache Hadoop, NoSQL, Apache Kafka, Apache Storm, Internet of Things, IoT, ETL, Cloudera, Hortonworks, MapR, Amazon AWS, AWS, Amazon, Amazon S3