More Organizations Kicking the Tires of Spark As Data Tool of Choice

One would obviously expect Hadoop to dominate the discussions at the recent Strata & Hadoop World conference in San Jose, CA. But much of the buzz this year was around Apache Spark, and how Spark might fit into the data management strategies of many organizations.

Arno Candel, chief architect at H2O.ai, shared his observations with Information Management on what conference attendees were most interested in, and how those needs are influencing his company’s go-to-market strategies.

Information Management: What are the most common themes that you heard among conference attendees and how do those themes align with what you expected?

Arno Candel: Many of the people I spoke with were interested in how Spark can, or would, fit into their overall data management and analytics strategy. While we at H2O.ai have been seeing increasing interest in Spark, which was one of the reasons that we built out Sparkling Water, our Spark API, I’ve always thought of Strata as a Hadoop conference; it is, after all, merged with Hadoop World.

It’s now clear that data storage is essentially a solved problem, while in-memory analytics and machine learning are driving most of the ongoing work in the field. We see ourselves as very much aligned with this trend.

IM: What are the most common data challenges that attendees are facing?

AC: Turning data into actionable insights has been, and remains, a key challenge for many organizations. Everyone has been told that they need to store more and more information in data stores like Hadoop, but there is often a lack of a plan for the “day after.” What do organizations do once they’ve stored all their data in a data lake? They realize that they need some kind of analytics strategy, but aren’t sure exactly what that should look like.

In addition, there is a huge problem with regard to data cleansing: much of the data that organizations have stored is messy, has missing variables, and so on, and organizations need to find a way to deal with that.
