Blog 8: Statistics Denial Myth #4, Rebranding Predictive Modeling

This blog discusses the ambition of rebranding predictive modeling as predictive analytics, and why this will enlarge the coming flood of statistical malfeasance.  The prediction problem involves uncertainty, and statistics provides us with the tools, language, and thinking for addressing numbers with uncertainty.  We have provided a problem-based clarification of statistics in Analytics Magazine, which should help people better identify statistics problems.

Rebranding & Mischaracterizing Predictive Modeling:

The rebranding of predictive modeling as predictive analytics is on its surface no more harmful than selling 'pre-owned cars' instead of 'used cars.'  The harm comes when rebranding mischaracterizes statistics and circumvents best practice.  As mentioned before, the concern with rebranding is that the next step is to strip away everything not understood by those merely following recipes.

Here is a quote that captures the problematic mischaracterization: 'PREDICTIVE ANALYTICS does NOT require an understanding of “STATISTICS / TRADITIONAL p-value STATISTICS” …. Period !!!!!'  This objection spreads rudimentary misunderstandings about statistics.  Here is another quote: 'It [predictive modeling] is not a [sub]field of statistics.'

'The only useful function for a statistician is to make predictions, and thus provide a basis for action.' (W. Edwards Deming)

First, prediction has always been a subfield of statistics.  We need look no further than the fact that prediction involves uncertainty.  Let us recap the four common objectives for statistical models: coefficient estimation, prediction (there it is!), grouping, and ranking.  Those new to the subject want to rename and reclassify everything as part of their rediscovery.

Second, note the qualification from the first quote, 'statistics/traditional p-value statistics.'  This is like claiming that division does not require an understanding of 'mathematics/traditional addition.'

Here is a third quote: "PA [Predictive Analytics] and DS [Data Science] both contrast with statistics in their emphasis on prediction over causality and their general use of observational in contrast to experimental methods."  PA and DS are rebrandings of predictive modeling and statistics, respectively, with no change in content.  All of the assumptions, thinking, and tools for dealing with uncertainty are statistical.

First, that 'predictive analytics emphasizes prediction over statistics' is just babble.  Similarly, we could claim that predictive modeling emphasizes prediction over statistics and sampling emphasizes sampling over statistics too.  We could claim that 'Division Scientists' perform more division than Mathematicians.  This does not express a new value proposition for predictive analytics.  In general, this boils down to a comparison between topical areas like commercial statistics and clinical statistics.  This distinction is lost when an applied statistician moves from clinical to commercial.

Second, the same confused type of claim is repeated with the idea that predictive analytics is more about observational data than statistics is.  Again, this is like claiming that predictive modeling is more about observational data than statistics is.  There is nothing new in this rebranding that is not already in predictive modeling, a subfield of statistics.

Third, statistics places a heavy emphasis on analyzing observational data, and it does this in a number of subfields: predictive modeling, DoS (Design of Samples), QC/PC (Quality Control/Process Control), Time Series, EDA (Exploratory Data Analysis), etc.  Hence, statistics is more about observational data than predictive analytics is.  Observational data contains uncertainty.

Close:

Predictive analytics is a rebranding of predictive modeling for promotional purposes.  We can be certain that prediction is a statistics problem because it involves numbers with uncertainty.  Claiming prediction does not require an understanding of statistics is like claiming that division does not require an understanding of mathematics.

Rebranding can have some benefits if performed thoughtfully.  However, we think that there is nothing thoughtful or measured in denying the value proposition of statistics.  The downside of rebranding is that important parts can be omitted just because they are not understood by recipe followers.  We have noticed that self-professed experts in predictive analytics seldom discuss prediction intervals!  Corrupting predictive modeling will facilitate a flood of statistical malfeasance.
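For readers who have not met a prediction interval, here is a minimal sketch in Python.  It is our own illustration, not taken from any of the quoted sources.  It assumes the data are independent draws from a roughly normal distribution, and it uses a normal quantile as a large-sample approximation (a t quantile would be more appropriate for a small sample).  The function names and the sample values are hypothetical.

```python
import math
import statistics
from statistics import NormalDist


def _summary(sample, level):
    """Return n, mean, sample standard deviation, and the normal quantile."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    return n, mean, sd, z


def confidence_interval_for_mean(sample, level=0.95):
    """Approximate interval for the unknown mean (uncertainty in the estimate)."""
    n, mean, sd, z = _summary(sample, level)
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


def prediction_interval(sample, level=0.95):
    """Approximate interval for ONE new observation.

    Assumption: independent, roughly normal data. The extra '1 +' term
    accounts for the variability of the new observation itself.
    """
    n, mean, sd, z = _summary(sample, level)
    half_width = z * sd * math.sqrt(1 + 1 / n)
    return mean - half_width, mean + half_width


# Hypothetical data, for illustration only.
sample = [10, 12, 11, 13, 9, 14, 10, 12]
print("interval for the mean:      ", confidence_interval_for_mean(sample))
print("interval for a new response:", prediction_interval(sample))
```

The point prediction is the same in both cases (the sample mean), but the prediction interval is always wider, because a new observation carries its own variability on top of the uncertainty in the estimated mean.  Reporting only the point prediction hides both sources of uncertainty.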

We sure could use Deming right now.  Many of us who consume or produce data analysis hang out in the new LinkedIn group, About Data Analysis.  Come see us.

Randy Bartlett

Statistician/Statistical Data Scientist at Blue Sigma Analytics

Randy Bartlett, Ph.D. CAP® PSTAT® is a statistician/statistical data scientist with 20+ years of practice experience analyzing and reviewing data analysis and leading business analytics teams. He designed 'A Practitioner’s Guide to Business Analytics' to be the foremost reference on how corporations can better implement business analytics in this era of Big Data.
