Ten Myths About Machine Learning

Ten Myths About Machine Learning, by Pedro Domingos

Machine learning used to take place behind the scenes: Amazon mined your clicks and purchases for recommendations, Google mined your searches for ad placement, and Facebook mined your social network to choose which posts to show you. But now machine learning is on the front pages of newspapers, and the subject of heated debate. Learning algorithms drive cars, translate speech, and win at Jeopardy! What can and can’t they do? Are they the beginning of the end of privacy, work, even the human race? This growing awareness is welcome, because machine learning is a major force shaping our future, and we need to come to grips with it. Unfortunately, several misconceptions have grown up around it, and dispelling them is the first step. Let’s take a quick tour of the main ones:

Machine learning is just summarizing data. In reality, the main purpose of machine learning is to predict the future. Knowing the movies you watched in the past is only a means to figuring out which ones you’d like to watch next. Your credit record is a guide to whether you’ll pay your bills on time. Like robot scientists, learning algorithms formulate hypotheses, refine them, and only believe them when their predictions come true. Learning algorithms are not yet as smart as scientists, but they’re millions of times faster.

Learning algorithms just discover correlations between pairs of events. This is the impression you get from most mentions of machine learning in the media. In one famous example, an increase in Google searches for “flu” is an early sign that it’s spreading. That’s all well and good, but most learning algorithms discover much richer forms of knowledge, such as the rule “If a mole has irregular shape and color and is growing, then it may be skin cancer.”

Machine learning can only discover correlations, not causal relationships. In fact, one of the most popular types of machine learning consists of trying out different actions and observing their consequences — the essence of causal discovery. For example, an e-commerce site can try many different ways of presenting a product and choose the one that leads to the most purchases. You’ve probably participated in thousands of these experiments without knowing it. And causal relationships can be discovered even in some situations where experiments are out of the question, and all the computer can do is look at past data.
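To make the e-commerce example concrete, here is a minimal sketch of learning by trying actions and watching what happens. It uses an epsilon-greedy strategy, which is only one simple way to run such experiments and is not necessarily what any particular site does. The variant names, the purchase rates, and the function `run_experiment` are all made up for illustration, and the visitors are simulated.

```python
import random


def run_experiment(true_rates, rounds=20000, epsilon=0.1, seed=0):
    """Epsilon-greedy sketch: usually show the best-looking variant,
    occasionally show a random one to keep learning about the others.

    true_rates maps a variant name to its hidden purchase probability
    (hypothetical numbers; a real site would not know these).
    """
    rng = random.Random(seed)
    variants = list(true_rates)
    shown = {v: 0 for v in variants}
    bought = {v: 0 for v in variants}

    def observed_rate(v):
        # Purchase rate seen so far; 0.0 for a variant never shown.
        return bought[v] / shown[v] if shown[v] else 0.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            choice = rng.choice(variants)              # explore
        else:
            choice = max(variants, key=observed_rate)  # exploit
        shown[choice] += 1
        if rng.random() < true_rates[choice]:          # simulated visitor buys
            bought[choice] += 1

    best = max(variants, key=observed_rate)
    return best, shown, bought


if __name__ == "__main__":
    best, shown, bought = run_experiment({"A": 0.02, "B": 0.05, "C": 0.20})
    print("best variant:", best)
    print("times shown:", shown)
```

Because the algorithm itself decides which presentation each visitor sees, differences in purchase rates reflect the effect of the presentation rather than a mere correlation in past data.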

Machine learning can’t predict previously unseen events, a.k.a. “black swans.” If something has never happened before, its predicted probability must be zero — what else could it be? On the contrary, machine learning is the art of predicting rare events with high accuracy. If A is one of the causes of B and B is one of the causes of C, A can lead to C, even if we’ve never seen it happen before. Every day, spam filters correctly flag freshly concocted spam emails. Black swans like the housing crash of 2008 were in fact widely predicted — just not by the flawed risk models most banks were using at the time.
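The spam example can be illustrated with a toy naive Bayes filter, a classic technique for this task. This is a sketch under stated assumptions, not a description of any production filter: the six training messages are invented, and the helpers `train` and `classify` are hypothetical names. The point is that a message never seen before still gets a sensible, nonzero score, because the filter generalizes from the words it contains.

```python
import math
from collections import Counter

# Invented training data, for illustration only.
TRAINING = [
    ("win cash now", "spam"),
    ("cheap pills now", "spam"),
    ("win free prize", "spam"),
    ("meeting at noon", "ham"),
    ("lunch at noon tomorrow", "ham"),
    ("project report attached", "ham"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.split())
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability score."""
    word_counts, label_counts, vocab = model
    total_messages = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total_messages)
        for word in text.split():
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing: a word unseen under this label still
            # gets a small nonzero probability instead of zero.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


if __name__ == "__main__":
    model = train(TRAINING)
    # This exact message does not appear anywhere in the training data.
    print(classify("free cash pills", model))
```

The add-one smoothing step is what answers the "what else could it be?" question: an unseen combination is assigned a small probability rather than zero.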

The more data you have, the more likely you are to hallucinate patterns. Supposedly, the more phone records the NSA looks at, the more likely it is to flag an innocent as a potential terrorist because he accidentally matched a terrorist detection rule. Mining more attributes of the same entities can indeed increase the risk of hallucination, but machine learning experts are very good at keeping it to a minimum.
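The risk from mining more attributes is easy to reproduce in a small simulation. In the sketch below, everything is random by construction: 20 hypothetical people get random labels, and each attribute is a random coin flip per person, so no attribute carries any real signal. The function name `best_chance_agreement` and all the numbers are assumptions chosen for illustration.

```python
import random


def best_chance_agreement(n_people=20, attribute_counts=(10, 100, 1000), seed=0):
    """For purely random data, report the best agreement between any single
    attribute and the labels after examining the first k attributes."""
    rng = random.Random(seed)
    labels = [rng.randint(0, 1) for _ in range(n_people)]
    best = 0.0
    results = {}
    for i in range(1, max(attribute_counts) + 1):
        # A meaningless attribute: one random bit per person.
        attribute = [rng.randint(0, 1) for _ in range(n_people)]
        agreement = sum(a == y for a, y in zip(attribute, labels)) / n_people
        best = max(best, agreement)
        if i in attribute_counts:
            results[i] = best
    return results


if __name__ == "__main__":
    for k, score in best_chance_agreement().items():
        print(f"best of first {k} random attributes: {score:.0%} agreement")
```

The best match can only improve as more attributes are examined, even though every one of them is noise. That is the hallucination risk the myth points to, and it is why a pattern found this way needs to be checked before it is believed.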
