Why are GPUs necessary for training Deep Learning models?

Most of you have heard about the exciting things happening in deep learning. You have probably also heard that deep learning requires a lot of hardware. I have seen people training a simple deep learning model for days on their laptops (typically without GPUs), which creates the impression that deep learning requires big systems to run.

However, this is only partly true, and it has created a myth around deep learning that becomes a roadblock for beginners. Numerous people have asked me what kind of hardware is best for doing deep learning. With this article, I hope to answer that question.

Note: I assume that you have fundamental knowledge of deep learning concepts. If not, you should go through this article.

When I was first introduced to deep learning, I thought it necessarily needed a large data center to run on, and that “deep learning experts” sat in control rooms operating these systems.

This is because in every book I consulted and every talk I heard, the author or speaker said that deep learning requires a lot of computational power. But when I built my first deep learning model on my meager machine, I felt relieved! I didn’t have to take over Google to be a deep learning expert.

This is a common misconception that every beginner faces when diving into deep learning. Although it is true that deep learning needs considerable hardware to run efficiently, you don’t need an infinite amount of it to do your task. You can even run deep learning models on your laptop!

Just a small disclaimer: the smaller your system, the longer it will take to get a trained model that performs well enough.

Let’s ask ourselves a simple question: why do we need more hardware for deep learning?

The answer is simple: deep learning is an algorithm, a software construct. We define an artificial neural network in our favorite programming language, which is then converted into a set of commands that run on the computer.

If you had to guess which parts of a neural network workflow require the most intense hardware resources, what would your answer be?

A few candidates off the top of my mind are:

- Preprocessing input data
- Training the deep learning model
- Storing the trained model
- Deploying the model

Among all these, training the deep learning model is the most intensive task. Let’s see in detail why this is so.

When you train a deep learning model, two main operations are performed: a forward pass and a backward pass.

In the forward pass, the input is passed through the neural network and, after processing, an output is generated. In the backward pass, we update the weights of the network on the basis of the error we got in the forward pass, as the sketch below illustrates.
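To make this concrete, here is a minimal sketch in NumPy of one training step for a single linear layer. The layer sizes, learning rate, and squared-error loss are illustrative assumptions, not anything from the original post:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 100))          # a batch of 32 inputs
y = rng.normal(size=(32, 10))           # the matching targets
w = rng.normal(size=(100, 10)) * 0.01   # the layer's weights
lr = 0.1                                # learning rate

# Forward pass: push the input through the layer to get an output.
out = x @ w
error = out - y

# Backward pass: gradient of the squared error with respect to the
# weights, then a small step to update the weights.
grad_w = x.T @ error / len(x)
w -= lr * grad_w
```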

Both of these operations are essentially matrix multiplications. In a matrix multiplication, each element in one row of the first array is multiplied with one column of the second array. In a neural network, we can consider the first array to be the input to the network, and the second array to be the weights of the network.
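The original post illustrates this with an image; the tiny NumPy example below, with arbitrary values, shows the same idea:

```python
import numpy as np

x = np.array([[1, 2],
              [3, 4]])   # "inputs"
w = np.array([[5, 6],
              [7, 8]])   # "weights"

# Each row of x is paired with each column of w:
# e.g. the top-left entry is 1*5 + 2*7 = 19.
print(x @ w)             # [[19 22]
                         #  [43 50]]
```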

This seems like a simple task. Now, just to give you a sense of the scale deep learning works at: VGG16, a convolutional neural network with 16 weight layers that is frequently used in deep learning applications, has roughly 140 million parameters, i.e. weights and biases. Now think of all the matrix multiplications you would have to do to pass just one input through this network! It would take years to train this kind of system with traditional approaches.
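To see where numbers like that come from, here is a back-of-envelope calculation using the published sizes of VGG16’s three fully connected layers (a dense layer with n_in inputs and n_out outputs has n_in * n_out weights plus n_out biases):

```python
# VGG16's three fully connected layers: (inputs, outputs)
fc_layers = [(25088, 4096), (4096, 4096), (4096, 1000)]

total = sum(n_in * n_out + n_out for n_in, n_out in fc_layers)
print(f"{total:,}")   # 123,642,856 -- the bulk of VGG16's ~140M parameters
```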

We saw that the computationally intensive part of a neural network is made up of many matrix multiplications. So how can we make them faster?

We can do this by performing many of the operations at the same time instead of one after the other. This is, in a nutshell, why we use a GPU (graphics processing unit) instead of a CPU (central processing unit) for training a neural network.
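If you want to see the difference for yourself, a quick and deliberately informal check is to time one large matrix multiplication on each device. This sketch assumes PyTorch is installed and a CUDA-capable GPU is available; exact timings will vary with your hardware:

```python
import time
import torch

n = 4096
a = torch.randn(n, n)
b = torch.randn(n, n)

# Time the multiplication on the CPU.
start = time.perf_counter()
a @ b
print(f"CPU: {time.perf_counter() - start:.3f}s")

# Time it on the GPU, if one is available. The synchronize() calls
# matter: GPU work is asynchronous, so without them we would mostly be
# timing the kernel launch rather than the computation itself.
if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()
    torch.cuda.synchronize()
    start = time.perf_counter()
    a_gpu @ b_gpu
    torch.cuda.synchronize()
    print(f"GPU: {time.perf_counter() - start:.3f}s")
```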

To give you a bit of intuition, let’s go back in history to the moment when GPUs first proved to be better than CPUs for this task.
