
What is the Difference Between Deep Learning and “Regular” Machine Learning?


Another concise explanation of a machine learning concept by Sebastian Raschka. This time, Sebastian explains the difference between deep learning and "regular" machine learning.

That's an interesting question, and I'll try to answer it in a very general way. The tl;dr version is: deep learning is essentially a set of techniques that help us parameterize deep neural network structures, that is, neural networks with many, many layers and parameters.

And if you are interested in a more concrete example: let's start with multi-layer perceptrons (MLPs)...

On a tangent: the term "perceptron" in MLPs may be a bit confusing, since we don't really want only linear neurons in our network. Using MLPs, we want to learn complex functions to solve non-linear problems. Thus, our network is conventionally composed of one or more "hidden" layers that connect the input and output layers. Those hidden layers normally have some sort of sigmoid activation function (log-sigmoid, the hyperbolic tangent, etc.). For example, think of a log-sigmoid unit in our network as a logistic regression unit that returns continuous output values in the range 0-1. A simple MLP could look like this:
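As a rough, purely illustrative sketch of that forward computation (the layer sizes are arbitrary choices and bias terms are omitted for brevity), a one-hidden-layer MLP with log-sigmoid units might look like this in NumPy:

```python
import numpy as np

def log_sigmoid(z):
    """Logistic (log-sigmoid) activation, squashing values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative sizes: 3 input features, 4 hidden units, 1 output unit
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(3, 4))   # weights "w" from the input to the hidden layer
W_output = rng.normal(size=(4, 1))   # weights from the hidden layer to the output

def forward(x):
    a_hidden = log_sigmoid(x @ W_hidden)      # activated hidden neurons (the "a"s)
    y_hat = log_sigmoid(a_hidden @ W_output)  # continuous output in (0, 1)
    return y_hat

x = np.array([0.5, -1.2, 3.0])  # a single example with 3 features
print(forward(x))               # threshold at 0.5 to get the predicted class label
```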


where y_hat is the final class label that we return as the prediction based on the inputs x in a classification task. The "a"s are our activated neurons and the "w"s are the weight coefficients. Now, if we add multiple hidden layers to this MLP, we'd also call the network "deep." The problem with such "deep" networks is that it becomes harder and harder to learn "good" weights for them. When we start training our network, we typically assign random values as initial weights, which can be terribly off from the "optimal" solution we want to find. During training, we then use the popular backpropagation algorithm (think of it as reverse-mode auto-differentiation) to propagate the "errors" from right to left and calculate the partial derivative of the cost with respect to each weight, so that we can take a step in the opposite direction of the cost (or "error") gradient.

Now, the problem with deep neural networks is the so-called "vanishing gradient": the more layers we add, the harder it becomes to "update" our weights, because the signal becomes weaker and weaker. Since our network's weights can be terribly off in the beginning (random initialization), it can become almost impossible to parameterize a "deep" neural network with backpropagation.
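To get an intuition for why the signal fades, note that the derivative of the logistic sigmoid never exceeds 0.25, so backpropagating through a long chain of saturating units multiplies many small factors together. A toy sketch (the depth, weights, and pre-activations here are all arbitrary illustrative choices, not a full backpropagation implementation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_derivative(z):
    s = sigmoid(z)
    return s * (1.0 - s)   # peaks at 0.25 when z == 0

# Toy illustration: the backpropagated signal through a chain of sigmoid
# units is (roughly) a product of local derivatives and weights.
rng = np.random.default_rng(1)
gradient_magnitude = 1.0
for layer in range(1, 21):              # 20 "layers", chosen arbitrarily
    w = rng.normal()                    # a random weight on the path
    z = rng.normal()                    # a random pre-activation
    gradient_magnitude *= abs(w) * sigmoid_derivative(z)
    print(f"after layer {layer:2d}: {gradient_magnitude:.2e}")
# The magnitude typically collapses toward zero as more layers are added,
# which is the "vanishing gradient" effect described above.
```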


Now, this is where "deep learning" comes into play. Roughly speaking, we can think of deep learning as "clever" tricks or algorithms that can help us with the training of such "deep" neural network structures.
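As one commonly cited example of such a trick (mentioned here purely as an illustration, not as a summary of the author's list), replacing the saturating log-sigmoid in the hidden layers with the rectified linear unit (ReLU) keeps the local gradient at exactly 1 for active units, so the backpropagated signal isn't repeatedly shrunk:

```python
import numpy as np

def relu(z):
    """Rectified linear unit: identity for positive inputs, zero otherwise."""
    return np.maximum(0.0, z)

def relu_derivative(z):
    # The local gradient is exactly 1 wherever the unit is active, so it
    # doesn't multiply the backpropagated error by a factor below 0.25
    # the way a saturating sigmoid does.
    return (z > 0).astype(float)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(relu(z))             # [0.  0.  0.  0.5 2. ]
print(relu_derivative(z))  # [0. 0. 0. 1. 1.]
```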

 


