Why our over-reliance on big data shows that we don't trust ourselves

Is data the modern oracle, the oil that will power the next industrial revolution—or just another round of business hype?

Of course it's true that there is more of the stuff, more information in forms that computers can collect and process, than ever before in human history. Even trying to quantify it is a fool's errand, when yesterday's "biggest dataset in the world" becomes today's portable hard drive. But there is more to it than size. After years of talking to people who use big data in fields from dating apps to the hunt for the Higgs boson, I managed to reverse-engineer my analysis into a handy acronym: DATA.

D is for dimensions, or diverse, or different datasets. By combining very different types of information, we can get new insights. Brain scans alone are informative, but combine them with health records, postcodes and weather reports, and you can test a hypothesis that vitamin D intake affects the progression of multiple sclerosis, for example.

A is for automatic. We do so many things through our digital devices, whether phones, computers or wearables, that collecting data is now the default. Every time you touch into a transport system, or pay with a bank card, or connect to a wifi network, you're adding to somebody's database. Much of the processing of that data is also automatic, invisible, opaque.

T is for time. Data streams into the databases almost in real time, making it easy to spot emerging patterns, and then to project that timeline into the future. And not just for obvious things like traffic flows: adding “sentiment analysis” of our social media activity to sales records and weather forecasts can predict the first big barbecue weekend of the year.

A is for AI, artificial intelligence. That's what spots the patterns in the tsunami of numbers. Yes, computers can calculate faster and more accurately than any human, but by using machine learning they do far more. Through trial and error, software modeled on aspects of how humans learn can sort images like brain scans (male/female or healthy/diseased) or more complex documents like job applications.

And this is where the dilemmas start to emerge.

The idea is that, unlike a biased human recruiter, a hiring algorithm will go on objective data. It won't take into account categories of human prejudice like race or gender. And if any disgruntled applicant disputes your hiring decision, you can claim that you followed procedure to the letter.

Even if it turns out that the algorithm got it wrong, and the new employee runs off with all the company's cash, at least you won't have to carry the can. You followed procedure, didn't you? Is it your fault if this candidate was the 1 percent, the exception that proves the rule is probabilistic, not absolute?

But what if you are the other 1 percent, the applicant whose scores are lousy, for reasons over which you have no control, but who would make the best employee if somebody would just give you the chance?

Say you live in the wrong part of town, too far from the workplace. Or you've had a lot of time off sick lately. Or your friends tagged you in a Facebook photo with a jokey reference to smoking weed.

 


