In this article, I cover the 12 types of AI problems, i.e. I address the question: in which scenarios should you use Artificial Intelligence (AI)? We cover this space in the Enterprise AI course.
Recently, I conducted a strategy workshop for a group of senior executives running a large multinational. In the workshop, one person asked the question: how many cats does it take to identify a cat?
This question refers to Andrew Ng’s famous paper on Deep Learning, in which he was able to correctly identify images of cats from YouTube videos. On one level, the answer is very clear, because Andrew Ng lists that number in his paper: 10 million images. But that answer is incomplete, because the question itself is limiting. There are many more details in the implementation, for example training on a cluster of 1,000 machines (16,000 cores) for three days. I wanted to present a more detailed response to the question.
Also, many problems can be solved using traditional Machine Learning algorithms, as described in an excellent post from Brandon Rohrer on which algorithm family can answer my question. So, in this post I discuss problems that can be uniquely addressed through AI. This is not an exact taxonomy, but I believe it is comprehensive. I have intentionally emphasized Enterprise AI problems because I believe AI will affect many mainstream applications, although a lot of media attention goes to the more esoteric applications.
Firstly, let us explore what Deep Learning is.
Deep learning refers to artificial neural networks that are composed of many layers. The ‘Deep’ refers to multiple layers. In contrast, many other machine learning algorithms, such as SVMs, are shallow because they do not have a Deep architecture built from multiple layers. The Deep architecture allows subsequent computations to build upon previous ones. We currently have deep learning networks with 10+ and even 100+ layers.
The presence of multiple layers allows the network to learn more abstract features: the higher layers build on the outputs of the lower layers. A Deep Learning network can be seen as a Feature extraction layer with a Classification layer on top. The power of deep learning is not in its classification skills, but rather in its feature extraction skills. Feature extraction is automatic (without human intervention) and multi-layered.
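To make the 'feature extraction with classification on top' picture concrete, here is a minimal sketch in plain Python. It is an illustration only: the weights are hand-picked hypothetical values rather than learned ones, and the names `dense`, `extract_features` and `classify` are my own assumptions, not taken from any library.

```python
def relu(values):
    """Keep positive values, replace negative ones with zero."""
    return [max(0.0, v) for v in values]

def dense(inputs, weights, biases):
    """One fully connected layer: each output is a weighted sum of all inputs."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def extract_features(pixels):
    # Lower layer: simple features computed directly from the raw inputs
    # (hypothetical weights that respond to a drop in value between neighbours).
    h1 = relu(dense(pixels, [[1.0, -1.0, 0.0], [0.0, 1.0, -1.0]], [0.0, 0.0]))
    # Higher layer: a more abstract feature built only from the lower layer's outputs.
    h2 = relu(dense(h1, [[1.0, 1.0]], [0.0]))
    return h2

def classify(features):
    # Classification layer on top: a single thresholded score.
    score = dense(features, [[2.0]], [-1.0])[0]
    return 1 if score > 0 else 0

label = classify(extract_features([0.9, 0.1, 0.8]))
```

In a real network the weights in every layer would be learned from data rather than written by hand; the point here is only that the classifier never sees the raw inputs, just the features produced by the layers beneath it.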
The network is trained by exposing it to a large number of labelled examples. Errors are detected and the weights of the connections between the neurons are adjusted to improve results. This optimisation process is repeated to create a tuned network. Once deployed, the tuned network can be used to assess unlabelled images.
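The training loop described above (show labelled examples, measure the error, adjust the weights, repeat) can be sketched in a few lines of Python. This is a deliberately tiny stand-in: a single sigmoid neuron rather than a deep network, trained on hypothetical toy data, with assumed values for the learning rate and the number of passes.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(weights, bias, features):
    """Output of one neuron: a value between 0 and 1."""
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)

def train(examples, epochs=2000, lr=0.5):
    """Fit one sigmoid neuron to (features, label) pairs by gradient descent."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):                  # repeat the optimisation process
        for features, label in examples:     # expose the neuron to a labelled example
            error = predict(weights, bias, features) - label  # detect the error
            # Adjust each weight in the direction that reduces the error.
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias

# Hypothetical labelled data: the label is 1 only when both inputs are 1.
labelled = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
weights, bias = train(labelled)
```

After training, `predict(weights, bias, [1, 1])` should be above 0.5 and the other three inputs below it. A deep network follows the same loop, with the error propagated back through many layers instead of one.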
Finding connections between variables and packaging them into a new single variable is called feature engineering. Deep Learning performs automated feature engineering. Automated feature engineering is the defining characteristic of Deep Learning, especially for unstructured data such as images. This matters because the alternative is engineering features by hand, which is slow, cumbersome and dependent on the domain knowledge of the person or people performing the engineering.
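For contrast, this is roughly what engineering a feature by hand looks like. The example is hypothetical and not from any particular project: a person with domain knowledge decides that weight and height are better combined into one variable (body mass index) before a model sees them.

```python
def body_mass_index(weight_kg, height_m):
    """Hand-crafted feature: packages two raw variables into a single one."""
    return weight_kg / (height_m ** 2)

# Hypothetical raw records with two separate variables each.
records = [
    {"weight_kg": 70.0, "height_m": 1.75},
    {"weight_kg": 90.0, "height_m": 1.80},
]

# Someone had to know that this particular combination is meaningful.
for record in records:
    record["bmi"] = body_mass_index(record["weight_kg"], record["height_m"])
```

Every such feature has to be thought of, coded and validated by a person. A deep network instead learns its own combinations of the raw inputs during training.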
Deep Learning suits problems where the target function is complex and the datasets are large and include examples of both positive and negative cases. Deep Learning also suits problems that involve Hierarchy and Abstraction.
Abstraction is a conceptual process by which general rules and concepts are derived from the usage and classification of specific examples. We can think of an abstraction as the creation of a ‘super-category’ which comprises the common features that describe the examples for a specific purpose but ignores the ‘local changes’ in each example. For example, the abstraction of a ‘Cat’ would comprise fur, whiskers etc.