Why are GPUs necessary for training Deep Learning models?
- by 7wData
Most of you will have heard about the exciting things happening with deep learning. You will also have heard that deep learning requires a lot of hardware. I have seen people train a simple deep learning model for days on their laptops (typically without GPUs), which gives the impression that deep learning requires big systems to run.
However, this is only partly true, and it creates a myth around deep learning that becomes a roadblock for beginners. Numerous people have asked me what kind of hardware would be better for doing deep learning. With this article, I hope to answer them.
Note: I assume that you have fundamental knowledge of deep learning concepts. If not, you should go through this article.
When I was first introduced to deep learning, I thought that it necessarily needed a large datacenter to run on, and that “deep learning experts” would sit in their control rooms to operate these systems.
This is because in every book I referred to and every talk I heard, the author or speaker always said that deep learning requires a lot of computational power. But when I built my first deep learning model on my meager machine, I felt relieved! I don’t have to take over Google to be a deep learning expert.
This is a common misconception that every beginner faces when diving into deep learning. Although it is true that deep learning needs considerable hardware to run efficiently, you don’t need infinite hardware to do your task. You can even run deep learning models on your laptop!
Just a small disclaimer: the smaller your system, the longer it will take to get a trained model that performs well enough.
Let’s just ask ourselves a simple question: why do we need more hardware for deep learning?
The answer is simple: deep learning is an algorithm, a software construct. We define an artificial neural network in our favorite programming language, which is then converted into a set of commands that run on the computer.
If you had to guess which parts of working with a neural network require intense hardware resources, what would your answer be?
A few candidates off the top of my head are:
Among all these, training the deep learning model is the most intensive task. Let’s see in detail why this is so.
When you train a deep learning model, two main operations are performed: the forward pass and the backward pass.
In the forward pass, the input is passed through the neural network and, after processing, an output is generated. In the backward pass, we update the weights of the neural network on the basis of the error we get in the forward pass.
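To make the two passes concrete, here is a minimal sketch in plain Python. It assumes a toy network with a single linear layer, a squared-error loss, and made-up numbers; the names `forward` and `backward` are illustrative and do not come from any library.

```python
# Toy network: one linear layer with no activation, y = w1*x1 + w2*x2 + b.
# All values and names here are assumptions made up for illustration.

def forward(weights, bias, inputs):
    """Forward pass: weighted sum of the inputs plus the bias."""
    return sum(w * x for w, x in zip(weights, inputs)) + bias

def backward(weights, bias, inputs, target, learning_rate):
    """Backward pass: gradient of the squared error, then one weight update."""
    error = forward(weights, bias, inputs) - target
    # d(error^2)/dw_i = 2 * error * x_i and d(error^2)/db = 2 * error
    new_weights = [w - learning_rate * 2 * error * x
                   for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * 2 * error
    return new_weights, new_bias

weights, bias = [0.0, 0.0], 0.0
inputs, target = [1.0, 2.0], 3.0

loss_before = (forward(weights, bias, inputs) - target) ** 2
for _ in range(20):
    weights, bias = backward(weights, bias, inputs, target, learning_rate=0.05)
loss_after = (forward(weights, bias, inputs) - target) ** 2

print(loss_before, loss_after)
```

Each training step runs one forward pass to measure the error and one backward pass to nudge the weights, so the loss after the loop is far smaller than before it. A real network repeats this over many layers and many inputs.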
Both of these operations are essentially matrix multiplications. In a matrix multiplication, each element in a row of the first matrix is multiplied with the corresponding element in a column of the second matrix, and the products are added up to give one element of the result.
So in a neural network, we can consider the first matrix as the input to the network and the second matrix as the weights of the network.
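The row-times-column rule can be written out as three nested loops. This is a minimal sketch in plain Python, not how a deep learning library does it internally; the function name `matmul` and the small matrices are assumptions chosen for illustration.

```python
def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix using plain loops."""
    n, k, m = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("columns of a must match rows of b")
    result = [[0] * m for _ in range(n)]
    for i in range(n):          # each row of the first matrix
        for j in range(m):      # each column of the second matrix
            for p in range(k):  # multiply element by element and accumulate
                result[i][j] += a[i][p] * b[p][j]
    return result

inputs = [[1, 2], [3, 4]]    # two input samples with two features each
weights = [[5, 6], [7, 8]]   # two inputs feeding two output units
outputs = matmul(inputs, weights)
print(outputs)  # [[19, 22], [43, 50]]
```

Note that the innermost statement runs n × m × k times, which is why the cost grows so quickly as the matrices get larger.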
This seems to be a simple task. But to give you a sense of the scale of deep learning: VGG16 (a convolutional neural network with 16 weight layers, frequently used in deep learning applications) has ~140 million parameters, i.e. weights and biases. Now think of all the matrix multiplications you would have to do to pass just one input through this network! It would take years to train this kind of system if we took traditional approaches.
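A rough back-of-the-envelope count shows how quickly the numbers grow. The sketch below counts the parameters and multiply-add operations of a single fully connected layer; the helper `dense_layer_cost` is hypothetical, and the layer size is an assumption based on the shape usually quoted for VGG16's first fully connected layer.

```python
def dense_layer_cost(n_inputs, n_outputs):
    """Return (parameters, multiply-adds per input) for one fully connected layer."""
    parameters = n_inputs * n_outputs + n_outputs  # weights plus one bias per output
    multiply_adds = n_inputs * n_outputs           # one per weight, for a single input
    return parameters, multiply_adds

# Assumed sizes: 25088 inputs (7 * 7 * 512) and 4096 outputs.
params, macs = dense_layer_cost(25088, 4096)
print(params, macs)
```

Under these assumptions, this one layer alone accounts for roughly 100 million of the parameters, and a single forward pass through it takes about as many multiply-add operations. Training repeats that for every input, every layer, and every pass over the data.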
We saw that the computationally intensive part of a neural network is made up of multiple matrix multiplications. So how can we make it faster?
We can do this by performing all the operations at the same time instead of one after the other. This is, in a nutshell, why we use a GPU (graphics processing unit) instead of a CPU (central processing unit) for training a neural network.
To give you a bit of intuition, let us go back in history to when GPUs were shown to be better than CPUs for the task.
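What makes "all at the same time" possible is that every element of the result depends only on one row of the first matrix and one column of the second, never on another element of the result. The sketch below illustrates that independence by handing the rows to a thread pool from Python's standard library. It is only an illustration of the idea: Python threads will not give a GPU-like speedup, and the function names are assumptions made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor

def output_row(row, b):
    """Compute one row of the product; needs only one row of a and all of b."""
    return [sum(x * y for x, y in zip(row, column)) for column in zip(*b)]

def matmul_sequential(a, b):
    # One row after the other, as a single worker would do it.
    return [output_row(row, b) for row in a]

def matmul_concurrent(a, b, workers=4):
    # Rows are independent, so they can be handed to separate workers.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda row: output_row(row, b), a))

a = [[1, 2], [3, 4], [5, 6]]
b = [[1, 0], [0, 1]]
print(matmul_sequential(a, b) == matmul_concurrent(a, b))
```

Both versions produce the same result because no row needs to wait for another. A GPU exploits the same property at a much finer grain, with a very large number of simple cores working on different elements simultaneously.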