What If We've Got Big Data and Analytics All Wrong?

Every once in a while I run into a little company that comes at an existing market as if the folks already in it are idiots -- and sometimes they are right. Here's the thing: What often happens is a company breaks out in a segment, and everyone groups around that company's ideas and emulates them.

Few initially stand up and say, "Wait a minute -- what if they're wrong?" Often we get so excited about building the market that we don't realize until much later that the initial attempt at solving the problem doesn't work. With big data and analytics, the typical project failure rate, depending on how you categorize the projects, can run as high as a whopping 80 percent.

The company I met with is Pneuron, and its approach is very different and, I think, way better -- at least when it comes to acquiring the data.

I'll wrap up with my product of the week: YouMail, an interesting free voicemail service that could solve our nasty robocall problem.

A few years back, I attended a talk by Harper Reed, the CTO of President Obama's reelection campaign. Most of the talk focused on how his team used analytics effectively to win, but one part that stuck with me was his take on why big data was stupid.

Reed argued -- and I agreed -- that the focus on big data caused people to do stupid things -- like think of ways to aggregate and collect lots of data elements into mammoth repositories that then were too big and complex to analyze. He argued compellingly that the focus shouldn't be on collecting massive amounts of data, because that just creates a bigger potential problem. Rather, it should be on analyzing the data you have to obtain the answers to critical questions.

I think this explains why so many big data projects fail. Too much time and money is focused on collecting massive amounts of data from systems that never were designed to talk to each other. Consequently, by the time the giant repositories are built, the data is out of date, corrupted, and damn near impossible to analyze.

But what if you didn't collect the data in the first place?

What if you left the data where it was, analyzed it in place, and then aggregated the analysis? In other words, rather than aggregating the data, you would aggregate the information you needed from it.
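To make the idea concrete, here is a minimal sketch in Python -- purely my own illustration, not Pneuron's actual product or API -- of aggregating the analysis rather than the data. Each hypothetical source computes a tiny local summary in place, and only those summaries are combined centrally.

```python
# Illustrative sketch only -- not Pneuron's actual product or API.
# Each "source" stands in for a separate system of record (CRM, ERP, etc.).
# Instead of copying raw rows into one giant repository, each source computes
# a small partial aggregate locally, and only those summaries are combined.

from dataclasses import dataclass

@dataclass
class PartialAggregate:
    count: int        # rows examined at the source
    total: float      # sum of the metric of interest (e.g., order value)

def analyze_in_place(rows, metric):
    """Runs at (or next to) the source system; raw rows never leave it."""
    values = [metric(r) for r in rows]
    return PartialAggregate(count=len(values), total=sum(values))

def combine(partials):
    """Aggregate the analysis, not the data: merge tiny summaries centrally."""
    count = sum(p.count for p in partials)
    total = sum(p.total for p in partials)
    return total / count if count else 0.0

# Hypothetical sources with incompatible schemas -- only the adapter differs.
crm_rows = [{"deal_usd": 1200.0}, {"deal_usd": 800.0}]
erp_rows = [{"invoice_total": 450.0}, {"invoice_total": 950.0}, {"invoice_total": 600.0}]

partials = [
    analyze_in_place(crm_rows, lambda r: r["deal_usd"]),
    analyze_in_place(erp_rows, lambda r: r["invoice_total"]),
]
print(combine(partials))  # average deal size across both systems: 800.0
```

The point of the sketch is that the only things crossing the wire are a count and a total per source, not the source data itself.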

Taking that approach, you don't create huge redundant repositories, you don't experience the massive lag of having to move and translate data repositories, and you don't have the potential data corruption problems. What you do have is a project that costs a fraction of what it otherwise would.

If done right, you'd end up with a higher probability of being both more accurate and more timely. Your hardware costs would be dramatically lower, and because you were solving the problem in components, you'd actually be able to start getting value before the project's conclusion.

With each additional repository, the solution would get smarter, as it would be able to answer questions not only from the new repositories, but also from those previously provided. That is basically what Pneuron does.

You'd have to be careful that bias didn't enter the process through the intermediate analysis, but the risk should be far lower than what naturally occurs when you slam together data elements that come from very different systems and likely are of very different ages.

You massively simplify the problem you are trying to solve, and you are better set up for mergers and acquisitions.

That last point is an interesting side benefit: typically when two companies merge, getting their systems to talk to each other is a nightmare, but if each side's data stays where it is and is analyzed in place, most of that integration pain goes away.


