Machine learning has been around for a while, so even if you haven’t worked on it as a developer, you’re probably very familiar with it as a consumer. When you add something to your cart on Amazon and see a list of other recommended products that you might also like, that's an example of machine learning. Essentially, machine learning is the development of computer programs that can learn and create their own rules based on data.
Developing machine learning applications is different from developing standard applications. Instead of writing code that solves a specific problem, machine learning developers create algorithms that take in data and then build their own logic based on that data. In the Amazon example, data about customer behavior and sales is used to determine which products you're most likely to also be interested in. The system isn't looking at a 1:1 relationship between what's in your cart and another specific product, such as a pairing that a marketer or salesperson recommended selling together. Instead, it takes into account all of the existing data, from all visits and all sales, and uses that to predict behavior and determine recommendations that make sense. New products and new data are always being added, so the recommendation results are continuously adjusting and improving.
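To make the idea of recommendations derived from all sales data more concrete, here is a minimal sketch in Python. It is not how Amazon's system works; it simply counts which products were bought together and ranks candidates by that count. The order history, product names, and the `build_co_counts` and `recommend` helpers are all made up for illustration.

```python
from collections import Counter
from itertools import permutations

# Hypothetical order history: each set is the contents of one past order.
orders = [
    {"tent", "sleeping bag", "lantern"},
    {"tent", "sleeping bag"},
    {"tent", "lantern"},
    {"sleeping bag", "pillow"},
    {"tent", "sleeping bag", "pillow"},
]

def build_co_counts(orders):
    """Count how often each pair of products appears in the same order."""
    co_counts = {}
    for order in orders:
        for a, b in permutations(order, 2):
            co_counts.setdefault(a, Counter())[b] += 1
    return co_counts

def recommend(cart, co_counts, top_n=2):
    """Rank products outside the cart by how often they were bought with cart items."""
    scores = Counter()
    for item in cart:
        for other, count in co_counts.get(item, {}).items():
            if other not in cart:
                scores[other] += count
    return [product for product, _ in scores.most_common(top_n)]

co_counts = build_co_counts(orders)
print(recommend({"tent"}, co_counts))  # ['sleeping bag', 'lantern']
```

Nobody wrote a rule saying "tents go with sleeping bags"; the ranking falls out of the data. Rebuilding the counts after new orders arrive can change the recommendations, which is the continuous adjustment described above.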
Why should you care about machine learning now? With the current increase in IoT and connected devices, we now have access to so much more data—and along with it, an increased need to manage and understand what we know.
Also, because so many different industries are starting to rely on machine learning, you have a great opportunity as a developer to learn how it works and how it might bring value to your product.
Types of Machine Learning Algorithms
Supervised learning: The training data consist of labeled inputs and known outcomes, which the machine studies until it can apply the labels on its own. For example, to create a face detection algorithm, you might provide images of landscapes, people, animals, buildings, and so on, with their respective labels, until the machine can reliably recognize a face in an unlabeled image.
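Learning from labeled examples can be sketched with a tiny nearest-neighbor classifier. Real face detection works on image pixels and far more data; here each image is reduced to two invented features, `roundness` and `edge_sharpness`, and the training values and the `predict` helper are assumptions made purely for illustration.

```python
import math
from collections import Counter

# Hypothetical labeled training data: (roundness, edge_sharpness) -> label.
training_data = [
    ((0.9, 0.2), "face"),
    ((0.8, 0.3), "face"),
    ((0.2, 0.9), "building"),
    ((0.3, 0.8), "building"),
    ((0.4, 0.3), "landscape"),
    ((0.5, 0.2), "landscape"),
]

def predict(features, training_data, k=3):
    """Label a new example by majority vote among its k nearest labeled examples."""
    nearest = sorted(
        training_data,
        key=lambda example: math.dist(example[0], features),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# An unlabeled image, described by the same two made-up features.
print(predict((0.85, 0.25), training_data))  # face
```

The labels are what make this supervised: the program can answer "face" only because a person attached that word to the training examples.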
Unsupervised learning: The machine analyzes unlabeled data and categorizes it based on similarities it has identified. So, you might provide the same photos as in the above example, but without their labels. The machine would still be able to cluster images based on shared characteristics (the sharp lines of a cityscape vs. the round shape of a face, for example), but it would not be able to say that the round shape is a “face.” These programs are used to identify groupings within data sets that may be difficult or impossible for a human to see.
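Grouping without labels can be sketched with k-means, one common clustering technique (the article does not name a specific one). As an assumption for illustration, each image is again reduced to two invented features, roundness and edge sharpness, and the first `k` points are used as starting centroids to keep the example deterministic.

```python
import math

# Hypothetical unlabeled data: (roundness, edge_sharpness) per image, no labels.
points = [
    (0.9, 0.2), (0.2, 0.9), (0.8, 0.3),
    (0.3, 0.8), (0.85, 0.25), (0.25, 0.85),
]

def mean(cluster):
    """Coordinate-wise average of the points in a cluster."""
    return tuple(sum(axis) / len(cluster) for axis in zip(*cluster))

def k_means(points, k, iterations=10):
    """Group points into k clusters by repeatedly reassigning and re-averaging."""
    # Simplification: seed the centroids with the first k points.
    centroids = list(points[:k])
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ended up empty.
        centroids = [mean(c) if c else centroids[i] for i, c in enumerate(clusters)]
    return clusters

for number, cluster in enumerate(k_means(points, k=2)):
    print(f"group {number}: {cluster}")
```

The output is "group 0" and "group 1", not "face" and "building". The program finds the groupings, but naming them still takes a person or a labeled data set.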
Semi-supervised learning: A combination of the above, used when there is a large amount of data but only some of it is labeled. Unsupervised learning techniques might be used to group and cluster the unlabeled data, while supervised learning techniques can be used to predict labels for it.
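One simple way to combine the two approaches is self-training: start from the few labeled examples and let each label spread to the unlabeled points nearest to it. This is only one of several possible techniques, and the data, feature values, and `self_train` helper below are hypothetical.

```python
import math

# Hypothetical data: two labeled examples and four unlabeled ones,
# each described by (roundness, edge_sharpness).
labeled = [((0.9, 0.2), "face"), ((0.2, 0.9), "building")]
unlabeled = [(0.8, 0.3), (0.7, 0.35), (0.3, 0.8), (0.35, 0.7)]

def self_train(labeled, unlabeled):
    """Repeatedly label the unlabeled point that sits closest to a labeled one."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        # Find the (unlabeled point, labeled example) pair with the smallest distance.
        point, (_, label) = min(
            ((p, example) for p in remaining for example in labeled),
            key=lambda pair: math.dist(pair[0], pair[1][0]),
        )
        labeled.append((point, label))
        remaining.remove(point)
    return dict(labeled)

for point, label in self_train(labeled, unlabeled).items():
    print(point, label)
```

Each newly labeled point becomes evidence for the next one, so two hand-labeled examples end up labeling the whole set. The risk is that an early mistake spreads in the same way, which is why real systems usually accept only high-confidence predictions.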
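Reinforcement learning: The machine is trained on ideal behavior within a specific context using simple reward data, so that actions leading to higher rewards are chosen more often over time.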
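Training on reward data can be sketched with an epsilon-greedy learner: it mostly repeats the action with the best average reward so far, and occasionally tries a random one. The environment, its three actions, and the fixed reward values below are invented for illustration, and real reinforcement learning problems usually involve changing states and noisy rewards.

```python
import random

# Hypothetical environment: a fixed reward per action, hidden from the learner.
REWARDS = {"left": 0.0, "stay": 0.2, "right": 1.0}

def train(steps=500, epsilon=0.2, seed=0):
    """Estimate the average reward of each action by trial and error."""
    rng = random.Random(seed)  # Seeded so the run is repeatable.
    actions = list(REWARDS)
    value = {action: 0.0 for action in actions}
    count = {action: 0 for action in actions}
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(actions)          # Explore: try a random action.
        else:
            action = max(actions, key=value.get)  # Exploit: best estimate so far.
        reward = REWARDS[action]
        count[action] += 1
        # Incremental running average of the rewards seen for this action.
        value[action] += (reward - value[action]) / count[action]
    return value

values = train()
best_action = max(values, key=values.get)
print(best_action)
```

The learner is never told which action is correct. It only sees the reward that follows each choice, and its behavior drifts toward whatever has paid off best.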
Faster than We Can Do by Hand
The biggest advantage of machine learning is that it allows us to do things much more quickly than we'd be able to do otherwise. It can't solve problems that a human being couldn't also solve, but it can take in a huge amount of data and very quickly build connections and predictions based on it. That becomes even more important as we continue to expand the amount of data we're generating through IoT and connected devices.