Big Data is a term used in the information technology industry for bringing multiple sources of information together into a data lake, a data repository built on relatively inexpensive, high-performing computer hardware. The value of your data can be extracted from a data lake through existing reporting and business analytics systems. Furthermore, the advent of machine learning capabilities for Big Data solutions provides additional analytical capabilities. Machine learning can derive meaningful insights that support business development and organizational growth.
Big Data in its current form will reduce your operational and infrastructure costs, but it will not provide your business with any additional value over what enterprise data warehouses provide. Why is that? The machine learning employed within today's Big Data solutions has no more capability than the statistical packages already in use within enterprise data warehouse solutions.
This may be true of Big Data's current incarnation. However, the future holds a new set of machine learning tools that integrate with Big Data. These new machine learning tools fall under the heading of neural networks. It might be helpful to first understand what neural networks can and cannot do.
First and foremost, neural networks cannot think! More about that later. Neural networks are capable of classification, regression analysis and forecasting. Given those capabilities, here is a small but amazing sample of uses that neural networks excel at:
There are more spectacular uses, but the above list will give you some idea that Big Data platforms are only in their infancy.
It is important to understand that not all neural networks are created equal. Picking a neural network that doesn't align with the specific problem that you're trying to solve will result in poor accuracy and performance.
To get a sense of how different neural networks are used, below is a small sampling of uses, the neural networks that best fit each problem space, and how they align to the capabilities listed earlier. Note: It is beyond the scope of this article to go into detail about how each of these networks works.
Neural networks need to be trained. The standard technique for training a neural network is called “backpropagation.” Training a network with backpropagation on conventional CPUs takes a lot of time. This is why the neural network community has turned to graphics processing units (GPUs), which can train a neural network as much as 250 times faster. That's the difference between one day of training on GPUs and roughly 250 days, or over eight months, on conventional CPUs.
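To make the training process described above a little more concrete, here is a minimal sketch in plain Python. It is not taken from any particular Big Data product. The network shape (two inputs, four hidden units, one output), the XOR toy data, the learning rate and the function name `train_xor` are all assumptions chosen for illustration. It runs on an ordinary CPU and says nothing about GPU speedups. It only shows the basic loop: a forward pass, an error measurement, and a backward pass that adjusts each weight.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_xor(hidden=4, epochs=5000, lr=0.5, seed=0):
    """Train a tiny one-hidden-layer network on XOR (illustrative values only).

    Returns (losses, predict): the summed squared error recorded for each
    epoch, and a function that maps an (x0, x1) pair to the network output.
    """
    rng = random.Random(seed)
    data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
            ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
    # w_h[j] holds [weight for x0, weight for x1, bias] of hidden unit j.
    w_h = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(hidden)]
    # w_o holds one weight per hidden unit, followed by the output bias.
    w_o = [rng.uniform(-1, 1) for _ in range(hidden + 1)]

    def forward(x):
        h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_h]
        o = sigmoid(sum(w_o[j] * h[j] for j in range(hidden)) + w_o[hidden])
        return h, o

    losses = []
    for _ in range(epochs):
        total = 0.0
        for x, y in data:
            h, o = forward(x)
            total += 0.5 * (o - y) ** 2
            # Backward pass: apply the chain rule from the output error
            # back to the hidden layer, using the weights before the update.
            delta_o = (o - y) * o * (1.0 - o)
            delta_h = [delta_o * w_o[j] * h[j] * (1.0 - h[j])
                       for j in range(hidden)]
            # Gradient-descent step on every weight and bias.
            for j in range(hidden):
                w_o[j] -= lr * delta_o * h[j]
                w_h[j][0] -= lr * delta_h[j] * x[0]
                w_h[j][1] -= lr * delta_h[j] * x[1]
                w_h[j][2] -= lr * delta_h[j]
            w_o[hidden] -= lr * delta_o
        losses.append(total)
    return losses, lambda x: forward(x)[1]


if __name__ == "__main__":
    losses, predict = train_xor()
    print("error in first epoch:", losses[0])
    print("error in last epoch: ", losses[-1])
    for pair in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]:
        print(pair, round(predict(pair), 3))
```

Even this toy network repeats the forward and backward pass 20,000 times (5,000 epochs over four examples). Real networks have millions of weights and far more training examples, which is why training time becomes the bottleneck the paragraph above describes.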