The Top Predictive Analytics Pitfalls to Avoid

Predictive Analytics can yield amazing results. The lift achieved by basing future decisions on patterns observed in historical events can far outweigh anything achieved by relying on gut feel or anecdote. Examples can be found across many industries. In a test we ran recently in the retail sector, applying stable predictive models gave us a five-fold increase in product take-up compared with a random sample. Let’s face it, there would not be so much focus on Predictive Analytics, and on Machine Learning in particular, if they were not yielding impressive results.

Read about the lessons we learned while using Machine Learning to predict the 2015 Rugby World Cup results
But predictive models are not bulletproof. They can be a bit like racehorses: somewhat sensitive to changes and with a propensity to leave the rider on the ground wondering what on earth just happened.

The commoditisation of Machine Learning is making data science more accessible to people who are not data scientists than ever before. With this in mind, my colleague and I sat down and devised the following list of top predictive analytics pitfalls to avoid in order to keep your models performing as expected:
Making incorrect assumptions about the underlying training data.
Rushing in and making too many assumptions about the underlying training data can often lead to egg on the proverbial face. Take time to understand the data: the trends in the distributions, missing values, outliers, etc.
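As a rough illustration of what "taking time to understand the data" can look like, here is a minimal sketch that profiles a single numeric column. The function name, the sample values and the 1.5 × IQR outlier rule are our own assumptions for illustration; pick whatever checks suit your data.

```python
import statistics

def profile_column(values):
    """Summarise one numeric column: size, missing count, median, IQR outliers."""
    present = [v for v in values if v is not None]
    q1, median, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    # Assumed rule of thumb: flag anything beyond 1.5 * IQR from the quartiles.
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "n": len(values),
        "missing": len(values) - len(present),
        "median": median,
        "outliers": [v for v in present if v < low or v > high],
    }

# Hypothetical basket values, with two gaps and one suspicious entry.
basket_values = [20, 22, 19, None, 21, 23, 20, 400, None, 22]
print(profile_column(basket_values))
```

Even a summary this small surfaces the two missing values and the 400 that would otherwise quietly distort a model.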

Working with low volumes.
Low volumes are the data scientist’s unhappy place: they can lead to statistically weak, unstable and unreliable models.
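To get a feel for why low volumes hurt, the sketch below estimates the margin of error around an observed take-up rate using the usual normal approximation. The function name and the numbers are hypothetical, and the approximation itself is only a rough guide at small sample sizes.

```python
import math

def margin_of_error(successes, n, z=1.96):
    """Approximate 95% margin of error for an observed rate (normal approximation)."""
    p = successes / n
    return z * math.sqrt(p * (1 - p) / n)

# The same 10% take-up rate, observed on 100 customers and on 10,000 customers.
print(round(margin_of_error(10, 100), 4))
print(round(margin_of_error(1000, 10000), 4))
```

The rate is identical in both cases, but the uncertainty around it is ten times wider on the small sample, which is exactly what makes models built on low volumes unstable.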

The over-fitting chestnut.
In other words, creating a model that has many branches and therefore seems to discriminate the target variable better on the training data, but falls over in the real world because it has learned the noise in that data rather than the underlying pattern.
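The effect is easy to reproduce on made-up data. The sketch below is an assumption-laden toy, not a recipe. It swaps the many-branched tree for a nearest-neighbour "memoriser", which over-fits in the same way, and compares it with a simple rule on both the training data and a holdout set. All names and the 20% label noise are hypothetical.

```python
import random

def make_data(n, rng, noise=0.2):
    """True pattern: label is 1 when x > 0.5; a share of labels is flipped as noise."""
    rows = []
    for _ in range(n):
        x = rng.random()
        label = int(x > 0.5)
        if rng.random() < noise:
            label = 1 - label
        rows.append((x, label))
    return rows

def simple_rule(x, train):
    """Captures only the underlying pattern and ignores the training rows."""
    return int(x > 0.5)

def memoriser(x, train):
    """1-nearest-neighbour: copies the label of the closest training row, noise included."""
    return min(train, key=lambda row: abs(row[0] - x))[1]

def accuracy(model, rows, train):
    return sum(model(x, train) == y for x, y in rows) / len(rows)

rng = random.Random(42)
train, holdout = make_data(300, rng), make_data(300, rng)
for name, model in [("simple rule", simple_rule), ("memoriser", memoriser)]:
    print(name, "train:", accuracy(model, train, train),
          "holdout:", accuracy(model, holdout, train))
```

The memoriser looks perfect on the data it was built from and loses that advantage on the holdout set. A large gap between the two scores is the warning sign to look for.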

Bias in the training data.
For example, suppose you offered a certain product only to Millennials. So, guess what? Millennials are going to come through strongly in the model.
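One simple check, sketched here under our own assumptions, is to compare how each segment is represented in the training data with how it is represented in the customer base as a whole. The field name `segment` and the counts are hypothetical.

```python
from collections import Counter

def segment_shares(rows, key="segment"):
    """Share of rows falling into each segment."""
    counts = Counter(row[key] for row in rows)
    total = sum(counts.values())
    return {segment: count / total for segment, count in counts.items()}

# Hypothetical data: who received the offer versus who is in the customer base.
training = [{"segment": "Millennial"}] * 90 + [{"segment": "Gen X"}] * 10
customer_base = [{"segment": "Millennial"}] * 30 + [{"segment": "Gen X"}] * 70

print("training:", segment_shares(training))
print("customer base:", segment_shares(customer_base))
```

A mismatch as large as this one suggests the model will reflect who was targeted rather than who is actually likely to take up the product.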

Including test data in the training data.
There have been a few epic fails where the test data was included in the training data. This gives the impression that the model is going to perform fantastically, when in reality the model is broken. In the predictive analytics world, if the results are too good to be true, it is worth spending more time on your validations and even getting a second opinion to check over your work.
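A cheap validation along these lines is to confirm that no record appears on both sides of the split. The sketch below assumes each row carries a unique identifier. The `customer_id` field and the sample rows are hypothetical.

```python
def leaked_ids(train_rows, test_rows, key="customer_id"):
    """Return the identifiers that appear in both the training and the test data."""
    train_ids = {row[key] for row in train_rows}
    test_ids = {row[key] for row in test_rows}
    return sorted(train_ids & test_ids)

train_rows = [{"customer_id": 1}, {"customer_id": 2}, {"customer_id": 3}]
test_rows = [{"customer_id": 3}, {"customer_id": 4}]

overlap = leaked_ids(train_rows, test_rows)
if overlap:
    print("Leakage: these records are in both sets:", overlap)
```

An empty result does not prove the split is clean (leakage can also come through derived features), but a non-empty one is a clear reason to stop and re-split.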

Not being creative with the provided data.
Predictive models can be significantly improved by creating some clever characteristics or features that can be used to better explain the trends in the data. Too often data scientists will work with what has been provided and will not spend enough time considering more creative features from the underlying data that can strengthen the models in ways that an improved algorithm cannot achieve.
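As an example of the kind of feature we mean, the sketch below turns a raw purchase history into three derived characteristics: recency, recent frequency and average spend. The feature names, the 90-day window and the sample history are our own assumptions.

```python
from datetime import date

def customer_features(transactions, as_of):
    """Derive features from one customer's (purchase_date, amount) history."""
    dates = [d for d, _ in transactions]
    amounts = [a for _, a in transactions]
    return {
        "days_since_last_purchase": (as_of - max(dates)).days,
        "purchases_last_90_days": sum((as_of - d).days <= 90 for d in dates),
        "average_spend": sum(amounts) / len(amounts),
    }

# Hypothetical purchase history for a single customer.
history = [
    (date(2016, 1, 5), 40.0),
    (date(2016, 3, 20), 60.0),
    (date(2016, 4, 1), 20.0),
]
print(customer_features(history, as_of=date(2016, 4, 30)))
```

None of these values exists in the raw data, yet each describes customer behaviour far more directly than a list of dated transactions does.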

Expecting machines to understand business.
Machines cannot figure out (yet) what the business problem is and how best to tackle it. This is not always straightforward and can require some careful thought, involving wholesome discussions with the business stakeholders.

Using the wrong metric to measure the performance of a model.
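One well-known instance, offered here as our own illustration, is relying on accuracy when the outcome is rare. In the hypothetical sketch below only 5 customers in 100 take up the product, and a "model" that predicts that nobody will still scores well on accuracy.

```python
def accuracy_and_recall(actual, predicted):
    """Accuracy over all rows, and recall over the rows that are truly positive."""
    pairs = list(zip(actual, predicted))
    correct = sum(a == p for a, p in pairs)
    true_positives = sum(a == 1 and p == 1 for a, p in pairs)
    return correct / len(actual), true_positives / sum(actual)

actual = [1] * 5 + [0] * 95   # 5 of 100 customers took up the product
always_no = [0] * 100         # a "model" that never predicts take-up

accuracy, recall = accuracy_and_recall(actual, always_no)
print("accuracy:", accuracy, "recall:", recall)
```

The accuracy looks respectable while the recall shows that not a single interested customer was found, so the metric has to match what the business actually cares about.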
