Know when your big data is telling big lies

Data scientists use statistical analysis tools to find non-obvious patterns in deep data. But they know the universe is full of spurious correlations. Big data simply intensifies the problem.

That is because, as the range of sources and the diversity of predictors continue to grow, the number of relationships that can potentially be modeled begins to approach infinity. As David G. Young pointed out, “predictive variables sometimes aren’t ....We’ve all seen variable interactions that change the significance, curvature, and even the sign of an important predictor.”
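To make that growth concrete, here is a rough sketch in Python that counts how many candidate relationships exist for a given number of predictors. The function name `candidate_relationships` is hypothetical, and the counts cover only pairs, triples, and non-empty subsets of predictors, not every possible model form.

```python
from math import comb


def candidate_relationships(n_predictors: int) -> dict:
    """Count simple groupings of predictors that a modeler could test.

    Hypothetical helper for illustration only.
    """
    return {
        # every two-way pairing of predictors
        "pairs": comb(n_predictors, 2),
        # every three-way interaction
        "triples": comb(n_predictors, 3),
        # every non-empty subset of predictors that could feed a model
        "subsets": 2 ** n_predictors - 1,
    }


for p in (10, 100, 1000):
    counts = candidate_relationships(p)
    print(p, "predictors:", counts["pairs"], "pairs,", counts["triples"], "triples")
```

Ten predictors already allow 45 pairings and 1,023 non-empty subsets, and each added source multiplies the options further.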

Thus, if you’re looking for a particular correlation in your data, you can probably find it if you’re clever enough to combine only the right data, specify only the right variables, and analyze it using only the right algorithm. Once you’ve hit on the right combination of modeling decisions, the patterns you seek may pop out like a genie from Aladdin’s lamp.
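A small simulation can illustrate the point. The sketch below, which assumes nothing beyond the Python standard library, generates a target and many predictors that are all pure random noise, then reports the strongest correlation it can find. The names `pearson` and `best_noise_correlation`, the row and predictor counts, and the seed are all illustrative choices, not anything prescribed by the article.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def best_noise_correlation(n_rows=30, n_predictors=1000, seed=0):
    """Search unrelated random predictors for the one that best 'explains' the target."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    best = 0.0
    for _ in range(n_predictors):
        # Each predictor is independent noise, so any correlation is spurious.
        predictor = [rng.gauss(0, 1) for _ in range(n_rows)]
        r = pearson(predictor, target)
        if abs(r) > abs(best):
            best = r
    return best


print("strongest correlation found in pure noise:", round(best_noise_correlation(), 2))
```

Because every series here is independent by construction, whatever correlation the search returns is an artifact of having looked at enough candidates.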

Yet the fact that you’ve supposedly discovered this correlation doesn’t mean it actually exists in the underlying real-world domain you’re investigating. It may simply be a figment of your specific approach to modeling the data you have at hand. You may have no fraudulent intent, and you may otherwise adhere to standard data-scientific methodologies, but you may choose to go no further if it appears you’ve already struck the pay dirt insight you were seeking.
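One common way to go a step further is to re-test a discovered pattern on data that played no part in finding it. The following is a minimal sketch of such a holdout check, again using only random noise and the Python standard library; `holdout_check` and its parameters are hypothetical names chosen for illustration.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def holdout_check(n_rows=60, n_predictors=1000, seed=0):
    """Pick the best predictor on one half of the rows, then score it on the other half."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_rows)]
    predictors = [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_predictors)]
    half = n_rows // 2

    # "Discovery": choose the predictor that correlates best on the first half only.
    best = max(predictors, key=lambda p: abs(pearson(p[:half], target[:half])))

    discovered_r = pearson(best[:half], target[:half])
    # "Validation": measure the same predictor on rows the search never saw.
    holdout_r = pearson(best[half:], target[half:])
    return discovered_r, holdout_r


discovered, validated = holdout_check()
print("correlation where it was discovered:", round(discovered, 2))
print("correlation on held-out rows:", round(validated, 2))
```

Since the data is noise, the discovered correlation is a product of the search and should typically shrink toward zero on the held-out rows; a pattern that survives this kind of check is at least less likely to be a figment of the modeling choices.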

If you’re a data scientist, failing to realize that you’re looking at non-existent statistical patterns may simply stem from the fact that you’re human.

 


