Why Data Science Isn’t an Exact Science

Organizations adopt data science with the goal of getting answers to more types of questions, but those answers are not absolute.

Business professionals have traditionally viewed the world in concrete terms and sometimes even round numbers. That legacy perspective is black and white compared to the shades of gray that data science produces. Instead of a single-number result such as 40%, data science produces a probabilistic result that combines a level of confidence with a margin of error. (The statistical calculations are far more complex than that, of course.)
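To make the "two numbers" idea concrete, here is a minimal Python sketch of one common textbook approach, the normal-approximation interval for a proportion. The function name `proportion_interval` and the survey figures (400 "yes" answers out of 1,000 responses) are assumptions made up for illustration; real analyses often call for more careful methods.

```python
import math
from statistics import NormalDist


def proportion_interval(successes, n, confidence=0.95):
    """Return (estimate, margin of error) for a proportion.

    Uses the normal approximation, which is only reasonable
    for fairly large samples. Illustrative helper, not a
    general-purpose statistical routine.
    """
    p_hat = successes / n
    # Two-sided critical value, about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, margin


# Hypothetical survey: 400 of 1,000 respondents said "yes".
p_hat, margin = proportion_interval(successes=400, n=1000)
print(f"{p_hat:.0%} ± {margin:.1%} at 95% confidence")
# prints "40% ± 3.0% at 95% confidence"
```

Under these assumptions, the "40%" becomes "40%, give or take about 3 points, at 95% confidence." A larger sample would shrink the margin, and a higher confidence level would widen it.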

While two numbers are arguably twice as complicated as one, confidence and error probabilities help non-technical decision-makers:

In fact, there are several reasons why data science isn't an exact science, some of which are described below.

"When we're doing data science effectively, we're using statistics to model the real world, and it's not clear that the statistical models we develop accurately describe what's going on in the real world," said Ben Moseley, associate professor of operations research at Carnegie Mellon University's Tepper School of Business. "We might define some probability distribution, but it isn't even clear the world acts according to some probability distribution."

You may or may not have all the data you need to answer a question. Even if you have all the data you need, there may be data quality problems that could cause biased, skewed, or otherwise undesirable outcomes. Data scientists call this "garbage in, garbage out."

According to Gartner, "Poor data quality destroys business value" and costs organizations an average of $15 million per year in losses.

If you lack some of the data you need, then the results will be inaccurate because the data doesn't accurately represent what you're trying to measure. You may be able to get the data from an external source, but bear in mind that third-party data may also suffer from quality problems. A current example is COVID-19 data, which is recorded and reported differently by different sources.
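A toy Python sketch can show how missing data skews a result. All of the numbers below are hypothetical, invented purely to illustrate the point: a business wants its average order value, but only the online orders were ever recorded.

```python
from statistics import mean

# Hypothetical order values in dollars for two customer segments.
online_orders = [20, 25, 30, 35, 40]
in_store_orders = [60, 70, 80, 90, 100]

# What the business is actually trying to measure: all orders.
full_data = online_orders + in_store_orders

# What it has on hand: in-store records were never collected.
partial_data = online_orders

true_average = mean(full_data)
estimated_average = mean(partial_data)

print(f"average over all orders:  {true_average}")
print(f"average over online only: {estimated_average}")
```

In this made-up case, the online-only average lands well below the average over all ten orders. Collecting more online orders would not close the gap, because the missing segment behaves differently from the one that was measured.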

"If you don't give me good data, it doesn't matter how much of that data you give me. I'm never going to extract what you want out of it," said Moseley.

It's been said that if one wants better answers, one should ask better questions. Better questions come from data scientists working together with domain experts to frame the problem. Other considerations include assumptions, available resources, constraints, goals, potential risks, potential benefits, success metrics, and the form of the question.

"Sometimes it's unclear what is the right question to ask," said Moseley.

Data science is sometimes viewed as a panacea or magic. It's neither.

"There are significant limitations to data science [and] machine learning," said Moseley. "We take a real-world problem and turn it into a clean mathematical problem, and in that transformation, we lose a lot of information because you have to streamline it somehow to focus on the key aspects of the problem."

A model may work very well in one context and fail miserably in another.
