6 myths about big data

Advances in cloud computing, data processing speeds, and the huge amount of data input from sources like IoT mean that companies are now collecting previously unseen amounts of data. Big data is now bigger than ever. But organizing, processing, and understanding the data is still a major challenge for many organizations.

Is your company still struggling to understand what big data is, and how to manage it? Here are 6 myths about big data, from the experts, to help you separate truth from fiction in the realm of big data.

Big data is a buzzword these days. But what it really means is still often unclear. Some people refer to big data as, simply, a large amount of data. But, that's not quite correct. It's a little more complex than that. Big data refers to how data sets, either structured (like Excel sheets) or unstructured (like metadata from email) combine with data like social media analytics or IoT data to form a bigger story. The big data story shows trends about what is happening within an organization—a story that is difficult to capture with traditional analytic techniques.

Jim Adler, head of data at Toyota Research Institute, also makes a good point: Data has a mass. "It's like water: When it's in a glass, it's very manageable. But when it's in a flood, it's overwhelming," he said. "Data analysis systems that work on a single machine's worth of data will be washed away when data scales grow 100 or 1000 times. So, sure, prototype in the small, but architect for the large."

"The biggest myth is you have to have clean data to do analysis," said Arijit Sengupta, CEO of BeyondCore. "Nobody has clean data. This whole crazy idea that I have to clean it to analyze doesn't work. What you do is, you do a 'good enough' analysis. You take your data, despite all the dirtiness, and you analyze it. This shows where you have data quality problems. I can show you some patterns that are perfectly fine despite the data quality problems. Now, you can do focused data quality work to just improve the data to get a slightly better insight."

Megan Beauchemin, director of business intelligence and analytics for InOutsource, agreed. "Often times, organizations will put these efforts on the back burner, because their data is not clean. This is not necessary. Deploying an analytic application will illuminate, visually, areas of weakness in data," she said. "Once these shortfalls have been identified, a cleanup plan can be put into place. The analytic application can then utilize a mechanism to highlight clean-up efforts and monitor progress."

"If your data is not clean, I think that is all the more reason to jump in," Beauchemin said. "Once you tie that data together, and you're bringing it to life visually in an application where you're seeing those associations and you're seeing the data come together, you're going to very quickly see shortfalls in your data." Then, she said, you can see where the data issues lie, which gives you a benchmark as you clean the data up.
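To make the "analyze first, clean later" idea concrete, here is a minimal sketch in Python. It is not the method used by BeyondCore or InOutsource; the records, field names, and helper functions are all hypothetical and chosen only for illustration. It runs a "good enough" average over whatever values can be parsed, and separately reports the share of missing values per field, which could serve as the kind of benchmark described above.

```python
from statistics import mean

# Hypothetical records: None, blank strings, and unparseable text
# stand in for the "dirty" values found in real data.
records = [
    {"region": "east", "revenue": "1200", "signup_date": "2017-01-04"},
    {"region": "west", "revenue": None, "signup_date": "2017-02-11"},
    {"region": "", "revenue": "950", "signup_date": None},
    {"region": "east", "revenue": "n/a", "signup_date": "2017-03-02"},
]


def is_missing(value):
    """Treat None and blank strings as missing."""
    return value is None or str(value).strip() == ""


def to_number(value):
    """Return a float, or None when the value cannot be parsed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def quality_report(rows):
    """Share of missing values per field, usable as a cleanup benchmark."""
    fields = sorted({key for row in rows for key in row})
    return {
        field: sum(is_missing(row.get(field)) for row in rows) / len(rows)
        for field in fields
    }


def good_enough_mean(rows, field):
    """Average the values that parse and count the rows that were skipped."""
    parsed = [to_number(row.get(field)) for row in rows]
    usable = [value for value in parsed if value is not None]
    return mean(usable), len(rows) - len(usable)


report = quality_report(records)
average, skipped = good_enough_mean(records, "revenue")
print(report)
print(f"average revenue: {average} ({skipped} rows skipped)")
```

Rerunning the report after each round of cleanup would show whether the missing-value shares are falling, while the average remains available the whole time, along with a count of how many rows it had to skip.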

Here's another reason you shouldn't wait to clean up your data: "By the time you've cleaned your data, it's three months old—so you have stale data," said Sengupta. So, the information is no longer relevant.

Sengupta spoke about a conference where Josh Bartman, from the First Interstate Bank, brought up an important point. "Josh showed how he was running an analysis, finding a problem, changing the analysis, rerunning the analysis. He said, 'Look, my analyses are only about four to five minutes apart.
