To compete effectively, large organisations need to extract actionable insights from all their data faster and more accurately than ever.
However, they are usually hampered by complex IT estates, which make untangling this data spaghetti expensive and difficult. Typically, the data is drawn from 20 to 100 different platforms.
The result? Datasets fail to reconcile, new data feeds take months to establish and different departments duplicate one another’s work (inconsistently).
As a result, confidence in the quality of data is low, which holds the business back from acting on insights.
Analysts entering a data warehouse may find as many as five or six different versions of the metric they are seeking. Not knowing which version to trust or which is most suitable, the analyst creates yet another one. Typically, 80% of the analyst's time is eaten up wrangling data to make it usable.
This is a massive source of duplication and a waste of valuable time that should have been spent producing actionable insights.
An organisation operating in this environment has no single source of the truth, leading to mistrust as departments build their own data marts in order to get on with their own work.
At the same time, projects frequently overrun on both cost and time, stacking up overheads.
Metadata is the key
Now, a new approach is enabling businesses to save millions of pounds by integrating data faster, better and more cheaply than ever, irrespective of technology.
This is the metadata-driven estate (MDE), which generates and authenticates insights through the use of metadata in a managed hub that spans all platforms.
In simple terms, this means that MDE unpicks the complexity of the “spaghetti” to produce data lineage showing exactly where a metric has come from, how it is calculated, whether it is in any sense polluted and who is using it.
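To make the idea of lineage concrete, here is a minimal sketch of how a metadata hub might record where a metric comes from, how it is calculated, whether it is polluted and who uses it. It is an illustration only, not how any particular MDE product works: the `MetricNode` class, the `lineage` and `is_polluted` helpers, and the example metric and platform names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MetricNode:
    """One entry in a hypothetical metadata catalogue."""
    name: str
    platform: str                                        # where the data lives
    formula: str = ""                                    # how it is calculated
    inputs: list[str] = field(default_factory=list)      # upstream entries
    consumers: list[str] = field(default_factory=list)   # who is using it
    polluted: bool = False                               # known quality problem


def lineage(catalogue: dict[str, MetricNode], name: str) -> list[str]:
    """Return every upstream entry of `name`, nearest first, without repeats."""
    seen: list[str] = []
    queue = list(catalogue[name].inputs)
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.append(current)
        queue.extend(catalogue[current].inputs)
    return seen


def is_polluted(catalogue: dict[str, MetricNode], name: str) -> bool:
    """A metric is polluted if it, or anything upstream of it, is flagged."""
    if catalogue[name].polluted:
        return True
    return any(catalogue[n].polluted for n in lineage(catalogue, name))


# Illustrative catalogue: two source feeds and one derived metric.
catalogue = {
    "crm.orders": MetricNode("crm.orders", platform="CRM"),
    "erp.refunds": MetricNode("erp.refunds", platform="ERP", polluted=True),
    "net_revenue": MetricNode(
        "net_revenue",
        platform="warehouse",
        formula="sum(orders.amount) - sum(refunds.amount)",
        inputs=["crm.orders", "erp.refunds"],
        consumers=["finance", "marketing"],
    ),
}

print(lineage(catalogue, "net_revenue"))      # upstream sources of the metric
print(is_polluted(catalogue, "net_revenue"))  # inherits the flag on erp.refunds
```

In this sketch a pollution flag on one source feed propagates to every metric built on it, which is the kind of question lineage is meant to answer before an analyst commits to a metric.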
MDE gives the analyst an easy-to-use search box to find all available metrics relevant to the subject, along with the lineage and usage of each, making it easy to decide which to choose.
MDE's automation ensures all the data is quality-managed, tracking data quality (DQ) issues on each data refresh. This improves confidence among analysts and end-users across all data management platforms, including Hadoop.
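As a rough illustration of tracking DQ issues on each refresh, the sketch below runs a set of rules over every row of a refreshed dataset and logs each failure against that refresh. The rule names, the `DQRule` and `DQIssue` classes, the `run_dq_checks` function and the sample rows are all assumptions made for the example, not part of any specific MDE implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DQRule:
    """A named check; `check` returns True when a row passes."""
    name: str
    check: Callable[[dict], bool]


@dataclass
class DQIssue:
    """One failed check, recorded against the refresh that produced it."""
    refresh_id: str
    rule: str
    row_index: int


def run_dq_checks(refresh_id: str, rows: list[dict], rules: list[DQRule]) -> list[DQIssue]:
    """Apply every rule to every row and collect the failures."""
    issues = []
    for index, row in enumerate(rows):
        for rule in rules:
            if not rule.check(row):
                issues.append(DQIssue(refresh_id, rule.name, index))
    return issues


# Hypothetical rules for an "amount" column.
RULES = [
    DQRule("amount_not_null", lambda r: r.get("amount") is not None),
    DQRule("amount_non_negative",
           lambda r: r.get("amount") is None or r["amount"] >= 0),
]

# Hypothetical rows from a single data refresh.
refresh_rows = [{"amount": 120.0}, {"amount": None}, {"amount": -5.0}]
issues = run_dq_checks("refresh-001", refresh_rows, RULES)

for issue in issues:
    print(issue.refresh_id, issue.rule, issue.row_index)
```

Because each issue carries the refresh identifier, the log can be compared from one refresh to the next to show whether the quality of a feed is improving or degrading.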