Strata HadoopWorld Fall 2016 postmortem: Maybe AI’s the future, but can we make the data science work?

Given all the hype over artificial intelligence (AI) these days, at first glance it would seem surprising that it appeared as almost an afterthought at Strata last week.

There were a handful of product announcements from vendors such as Maana, which added semantic search-like capabilities in the newest release of its knowledge management platform for resource-intensive industries like oil and gas; and Splunk, which grafted machine learning onto its offerings for identifying and resolving incidents from IT system log files.

And in a keynote talk entitled "Connected Eyes," Microsoft's Joseph Sirosh spoke of a project with India's leading eye institute that applied machine learning over large patient populations to improve outcomes for eye surgery.

But this obscures the bigger picture. Conference sponsor O'Reilly acknowledged this by breaking out AI into a separate pre-event track the day before. And anyway, this wasn't a Google Cloud event, where AI was front and center.

So, get used to it. There's plenty of hype going around about whether AI can, will, or should replace humans (spoiler alert: the answers are "not"). But even if present-day AI is no smarter than a bunch of idiot savants, there are plenty of practical and often unglamorous jobs that AI's core ingredient, machine learning (ML), is already performing.

Last year at Strata, we saw ML becoming almost ubiquitous in tooling for the management and governance of data lakes, from providers spanning A to Z.

The rationale for using ML, rather than static governance rules, lies in the nature of data lakes. Unlike with data warehouses, you won't know exactly what data will flow in, so it won't be practical to build rules ahead of time for dictating schema, enforcing data quality, de-duping, or identifying what data is likely to be sensitive (even weblogs could give PII data away).

Governance, whether it involves preparing data, building a catalog, or identifying master or reference data, may be a moving target requiring the system to "learn" how the norms are changing.

And there's ML elsewhere as well. Providers like Cloudera build ML into the trouble ticket tracking that backs the automated "phone home" function of subscriber client technical support.

As we noted with our take on DataRobot, there is a growing array of tools aimed at simplifying or accelerating different aspects of the lifecycle of building and deploying ML programs.

And ML is showing up in end user analytic tools that help humans parse the signals in data, wrangle it into shape, suggest which questions to ask, and help piece together the narrative.

In other words, when it comes to the packaged software tools that govern big data or analyze it, we're probably starting to take embedded machine learning for granted.

But what if your own data scientists want to get their hands dirty? As we noted a few weeks back, there's a lot of pent-up enthusiasm among R and Python programmers for ML, which many look at as the latest shiny new thing.

But for all the enthusiasm, SQL and streaming are more frequent workloads, at least among Spark users, according to the 2016 Spark Survey just released by Databricks.

 


