It’s actually not so new (Gartner coined the term a couple of years back), but dark data is finally starting to catch on in market research circles, and it represents a huge untapped opportunity for insights!
Gartner defined dark data as “the information assets organizations collect, process, and store during regular business activities, but generally fail to use for other purposes.”
The definition has since expanded to encompass not just internal data, but the broader spectrum of data that are readily available to organizations.
The common denominators are 1) these data are largely unstructured and 2) they are not being analyzed. In fact, according to IDC, 90% of unstructured data are never analyzed!
Why Search in the Dark?
Maybe you’ve heard this one?
A police officer comes upon a man crawling around on all fours under a streetlight one evening.
The man explains that he’s looking for his wallet.
“Where do you think you lost it?” asks the policeman.
“Across the street, but the light is so much better here,” says the man.
This joke is popular among data scientists, and I think it illustrates the irrationality of a lot of common thinking in research these days. We tend to search for insights in a relatively limited but easily accessible location, namely survey data, as if the only answers to be found must be there.
And even that relatively small pond isn’t being thoroughly fished. As I’ve blogged in the past, for most of us, even survey open-ends/comment data are still “dark data”!
At the risk of deluging you with metaphors, the fact remains that what we can find in our survey data is only the tip of the insights iceberg.
We have at our disposal all manner of unstructured data that text analytics is uniquely suited to organize and understand, including images and video, even without any enrichment or visual content analysis. For example, images often come with file names and metadata descriptions in text format that can be analyzed with software like OdinText. Videos, too, often come with transcripts, and there are technologies like YouTube’s that can handle audio-to-text transcription.
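To make the image example concrete, here is a minimal Python sketch of treating file names and metadata descriptions as text. It is not how OdinText works; the records, field names, stopword list, and helper functions are all hypothetical and exist only to illustrate the idea of counting terms in text that travels with an image.

```python
import re
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical image records: a file path plus a free-text metadata description.
IMAGES = [
    {"file": "uploads/red_running_shoes_01.jpg", "description": "Red running shoes on a track"},
    {"file": "uploads/IMG_2045.png", "description": "Customer photo of damaged shoe box"},
    {"file": "uploads/blue-trail-shoes.jpeg", "description": ""},
]

# Illustrative stopword list (an assumption, not a standard one).
STOPWORDS = {"a", "an", "the", "of", "on", "in", "img"}


def tokens(text):
    """Lowercase the text and keep alphabetic word tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def image_text(record):
    """Combine the file name (without folder or extension) with the description."""
    stem = PurePosixPath(record["file"]).stem
    return f"{stem} {record['description']}"


def term_counts(records):
    """Count how often each term appears across all image records."""
    counts = Counter()
    for record in records:
        counts.update(tokens(image_text(record)))
    return counts


if __name__ == "__main__":
    for term, n in term_counts(IMAGES).most_common(5):
        print(term, n)
```

Even this crude count shows what the images are about without looking at a single pixel. A real text analytics tool would go well beyond simple term frequencies.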
A Few Things to Consider
Dark data can be Big Data. And very Big Dark Data can prove daunting (that’s partly why it stays dark in the first place).
But we’ve found that dark data can also be quite small.
And just as Big Data isn’t necessarily valuable just because it’s big, dark data certainly isn’t valuable just because it’s dark.