Orphaned data, information that companies collect and store with the intention of analyzing it at some future point, may have a name that tugs at your heartstrings. More likely than not, though, it is obsolete and tugging at your organization’s purse strings instead.
That’s because orphaned data is particularly expensive to maintain, according to a 2016 report from the Data Genomics Project. Orphaned data accounts for about 1.6 percent of the total enterprise file population but hogs about 5.1 percent of valuable enterprise storage capacity, all while eating up overhead and inflating IT operating costs.
What’s more, orphaned data is ownerless and stale: More than 40 percent of corporate data hasn’t been modified in at least three years and about 12 percent hasn’t been touched in the last seven years.
Yet no organization sets out to create or hoard orphan data.
More often than not, data is captured with the goal of delivering perishable insights from real-time analytics (think clicks on a website or geolocations from mobile customers), but that data is only useful while the customer is making a purchase.
If action and analysis do not come together in that single moment, the data has lost its value as real-time information. Weeks, months, even years can go by without the organization realizing that its orphaned data is there.
What’s an organization to do?
For starters, enterprise data needs to be classified and tagged, not simply squirreled away in storage to be quickly forgotten. Through classification and tagging, some orphaned data can be repurposed as historical data and re-analyzed for management insights that improve business results.
Once classified and tagged, useless orphan data can be pulled permanently from the data lake while still-valuable data can be thrown back to play a more useful role in the organization’s big data ecosystem.
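As a rough illustration of what classification and tagging could look like in practice, here is a minimal Python sketch. Everything in it is an assumption made for the example rather than a method from the Data Genomics report: the `DataAsset` record, the tag names, and the three-year staleness threshold (borrowed from the "hasn’t been modified in at least three years" figure above) are all hypothetical. It tags assets that are both ownerless and stale as orphan candidates, then measures how much storage they occupy.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Optional

# Assumed threshold: treat data untouched for three years as stale.
STALE_AFTER = timedelta(days=3 * 365)


@dataclass
class DataAsset:
    """Hypothetical catalog record for one stored file or data set."""
    name: str
    owner: Optional[str]        # None means nobody is responsible for it
    last_modified: datetime
    size_bytes: int
    tags: set = field(default_factory=set)


def tag_asset(asset: DataAsset, now: datetime) -> DataAsset:
    """Attach 'ownerless', 'stale', and 'orphan-candidate' tags in place."""
    if asset.owner is None:
        asset.tags.add("ownerless")
    if now - asset.last_modified >= STALE_AFTER:
        asset.tags.add("stale")
    # Only data that is both ownerless and stale is flagged for review.
    if {"ownerless", "stale"} <= asset.tags:
        asset.tags.add("orphan-candidate")
    return asset


def storage_share(assets: list, tag: str) -> float:
    """Fraction of total bytes held by assets carrying the given tag."""
    total = sum(a.size_bytes for a in assets)
    tagged = sum(a.size_bytes for a in assets if tag in a.tags)
    return tagged / total if total else 0.0


if __name__ == "__main__":
    now = datetime(2024, 1, 1)
    catalog = [
        DataAsset("clicks.log", None, datetime(2019, 1, 1), 300),
        DataAsset("orders.csv", "sales", datetime(2023, 12, 1), 700),
    ]
    for asset in catalog:
        tag_asset(asset, now)
    print(storage_share(catalog, "orphan-candidate"))
```

Flagged assets are only candidates: someone still has to decide whether each one is deleted or kept as historical data, which is why the sketch tags records rather than removing them.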