Big:  Data

Big: Data, model, quality and variety

Big:  Data, model, quality and variety
The “big” part of big data is about enabling insights that were previously indiscernible. It’s about uncovering small differences that make a big difference in domains as widespread as health care, public health, marketing and business process optimization, law enforcement and cybersecurity – and even the detection of new subatomic particles.

But the “bigness” of your data is not its most important characteristic. Here are three other considerations when it comes to getting value from big data.

Of the three “V’s” of big data (volume, velocity and variety), the best advice for many organizations is to forget about big volume. For my money the real value in big data comes from its variety.

Consider this example from the natural sciences – the discovery and eventual acceptance of plate tectonics. First proposed as the theory of Continental Drift by Alfred Wegener in 1912, it was not until the 1960’s that it was fully accepted based on the overwhelming data-driven evidence acquired across a wide variety of fields:

Read Also:
How Machine Learning Makes Databases Ready for Big Data

Getting value out of your variety is first and foremost a data integration task. Don’t let your big data become Big Silos. Start within a function, like production or marketing, and integrate those data silos first.

For example, in customer service, bring together the separate web, call center and field service data. The next step is to integrate your more far flung disparate systems – valuable insights arise when you’ve got a holistic view of customer and product attributes along with sales data by channel, region and brand.

The value from data integration grows exponentially with each additional data source. Big variety is the future of big data.

With all the hype over big data, we often overlook the importance of modeling as its necessary counterpart. There are two independent limiting factors when it comes to decision support: The quality of the data, and the quality of the model.

Most of the big data hype assumes that the data is always the limiting factor, and while that may be the case for the majority of projects, I’d venture that bad or inadequate models share more of the blame than we care to admit.

Read Also:
Cloudera Corporate Overview - Driving Big Value with Big Data [Video]

It’s a balancing act, between the quantity and quality of our data, and the quality and fit-for-purposeness of our models, a relationship that can frequently get out of balance.

In one instance we may have remarkable models starved for good data, and on the other hand, volumes of sensor or customer data sit idle with no established approach for exploration, analysis and action.

“Recognizing that all of our decisions are based on our models of reality, not reality itself, is a key to understanding decision making. Too many individuals concentrate their efforts on perfecting “the data” that they then proceed to process through models that have little or no semblance of reality.

Read Full Story…

 

Leave a Reply

Your email address will not be published. Required fields are marked *