The Data Science Behind AI

Summary:  For those traditional data scientists who are interested in AI but still haven’t taken a deep dive, here’s a high-level overview of the data science technologies that combine into what the popular press calls artificial intelligence (AI).

We and others have written quite a bit about the various types of data science that make up AI.  Still, I hear many folks asking about AI as if it were a single entity.  It is not.  AI is a collection of data science technologies that, at this point in their development, are neither particularly well integrated nor easy to use.  In each of these areas, however, we’ve made a lot of progress, and that’s caught the attention of the popular press.

This article is not intended to be a deep dive but rather the proverbial 50,000-foot view of what’s going on.  If you’re a traditional data scientist who has read some articles but still hasn’t put the big picture together, you might find this a way of integrating your current knowledge and even discovering where you’d be interested in focusing.

AI is Simply the Sum of its Data Science Parts

The data science ‘parts’ that make up AI fall into the following categories.  There is overlap here, but these are the detailed topics you’ll see in the press.

These are all separate disciplines (OK, the category of Deep Learning actually contains some of the others).  AI is simply the sum of these parts.  They hang together only very loosely and have been bolted together into some really marvelous applications by a whole host of startups and major players.  When they work well together, as they do for example in Watson or Echo/Alexa, or as they are starting to do in self-driving cars, they may appear to be more than the sum of their parts.  But they’re not.  Integration of these different technologies is still one of the biggest challenges.

What Must Our AI be Able to Do?

When explaining this to beginners, I always find it helpful to start with this anthropomorphic description of the human-like capabilities our AI would need to have.

See:  recognize objects in still and video images.

Speak:  respond meaningfully to our input either in the same language or even a foreign language.

Learn:  change its behavior based on changes in its environment.

You can immediately begin to see that many of the commercial applications of AI that are emerging today require only a few of these capabilities.  But the more sophisticated applications that we’re looking forward to would need to have pretty much all of these.

Here’s where this gets a little messy.  These capabilities don’t necessarily line up one-to-one with their underlying data science.  But how the data science matches up with these requirements is the most important part of truly understanding what’s going on in AI today.  As a diagram they would match up more or less like this:

You may have noticed that ‘Deep Learning’ is missing from our chart.  That’s because it is a summary category for the Recurrent Neural Nets and Convolutional Neural Nets above.  Artificial Neural Nets (ANNs), the highest summary level, have been around since the 1980s and have always been part of the standard data science machine learning tool kit for solving standard classification and regression problems.

What’s happened recently is that massive increases in parallel processing, cloud processing, and the use of GPUs (graphics processing units) instead of traditional Intel chips have allowed us to experiment with versions of ANNs that have dozens or even more than a hundred hidden layers.  These hidden layers are what cause us to call these types ‘deep’, hence ‘deep learning’.  Adding hidden layers means multiplying computational complexity, which is why we had to wait for the hardware to catch up with our ambitions.
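To make the point about hidden layers and computational cost a little more concrete, here is a minimal Python sketch.  It simply counts the weights (roughly one multiply-add each per example) in a plain fully connected network as hidden layers are added.  The function names and the layer sizes (784 inputs, 512 units per hidden layer, 10 outputs) are assumptions chosen for illustration only, and real deep nets such as Convolutional or Recurrent Neural Nets share weights in ways this simple count ignores.

```python
def dense_layer_sizes(n_inputs, n_hidden_layers, hidden_width, n_outputs):
    """Return the layer widths of a plain fully connected network.

    All sizes are hypothetical; they are not taken from any specific model.
    """
    return [n_inputs] + [hidden_width] * n_hidden_layers + [n_outputs]


def forward_pass_cost(layer_sizes):
    """Count the weights between consecutive layers.

    Each weight costs about one multiply-add per example in a forward pass,
    so this is a rough proxy for compute (biases and activations ignored).
    """
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))


if __name__ == "__main__":
    # Compare a shallow net with progressively deeper ones.
    for depth in (1, 10, 100):
        sizes = dense_layer_sizes(784, depth, 512, 10)
        print(f"{depth:>3} hidden layers -> {forward_pass_cost(sizes):,} weights")
```

Under these assumptions, going from 1 to 100 hidden layers raises the count from roughly 400 thousand to roughly 26 million weights, and training repeats that work over many examples and many passes, which is the kind of load GPUs are well suited to.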
