I got 99 data stores and integrating them ain't fun


You know the story: a corporation grows bigger and bigger through acquisitions, org charts sprawl off the page, IT assets skyrocket, everyone holds on to their own systems to secure their roles, complexity multiplies, chaos reigns, and delivering value through IT becomes painstaking. This may help explain why a lead enterprise architect at one of Europe's biggest financial services organizations is looking for solutions in unusual places.

Let's call our guy Werner and his organization WXYZ. Names changed to protect the innocent, but our fireside chat at the Semantics conference last week was real, and indicative of data integration pains and remedies. WXYZ's course over the years has left it with tens of different data stores that need to be integrated to offer operational and strategic analytic insights. A number of initiatives with a number of consultancies and vendors have failed to deliver, budgets are shrinking, and staff numbers are dwindling.

Granted, a big part of this has nothing to do with technology per se, but more with organizational politics and vendor attitudes. But when grandiose plans fail, the ensuing stalemate means that moving forward requires a quick win: one that ideally needs as little time and infrastructure as possible, can be deployed incrementally, and can eventually scale as required. A combination of well-known concepts and under-the-radar software may offer a solution.


What do you do, then, when you cannot afford to build and populate a data lake, or yet another data warehouse? Federated querying to the rescue. This means that data stay where they are: queries are sent over the network to the different data sources, and overall answers are compiled by combining the results. The concept has been around for a while and is used by solutions like Oracle Big Data. Its biggest issues revolve around having to develop and/or rely on custom solutions for communication and data modeling, making it hard to scale beyond point-to-point integration.
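The mechanics of "queries go out, partial results come back, answers are compiled" can be sketched in a few lines. This is a minimal illustration with two hypothetical in-memory stores (a CRM and a billing system, both invented for the example); in a real deployment each sub-query would travel over the network to a remote endpoint.

```python
# Two hypothetical data stores, each with its own record layout.
# In practice these would be separate systems reached over the network.
crm = [  # customer master data held by a CRM system
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]
billing = [  # invoices held by a separate billing system
    {"cust": 1, "amount": 120.0},
    {"cust": 1, "amount": 80.0},
    {"cust": 2, "amount": 200.0},
]

def federated_total_per_customer():
    """Send one sub-query to each store, then combine the partial results."""
    # Sub-query 1: fetch customer names from the CRM store.
    names = {row["customer_id"]: row["name"] for row in crm}
    # Sub-query 2: aggregate invoice amounts inside the billing store.
    totals = {}
    for row in billing:
        totals[row["cust"]] = totals.get(row["cust"], 0.0) + row["amount"]
    # Compile the overall answer by joining the two partial results locally.
    return {names[cid]: total for cid, total in totals.items()}

print(federated_total_per_customer())  # {'Ada': 200.0, 'Grace': 200.0}
```

Note how the join key differs between the two stores (`customer_id` vs `cust`): that hard-coded knowledge is exactly the point-to-point coupling that makes ad hoc federation hard to scale.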

Could these issues be addressed? Data integration relies on mappings between a mediated schema and the schemata of the original sources, and on transforming queries posed against the mediated schema to match each source's own schema. Mediated schemata don't have to be developed from scratch: they can be readily reused from a pool of curated Linked Data vocabularies.
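The mapping-and-rewriting step above can be sketched as follows. This is only an illustration under assumed names: the mediated attributes borrow terms from the schema.org vocabulary, while the per-source column names and the `MAPPINGS` table are invented for the example.

```python
# Hypothetical mappings from a mediated schema (attribute names reused from
# the schema.org Linked Data vocabulary) to each source's native columns.
MAPPINGS = {
    "crm":     {"schema:name": "full_name", "schema:email": "mail"},
    "billing": {"schema:name": "cust_name", "schema:email": "email_addr"},
}

def rewrite(query_attrs, source):
    """Translate mediated-schema attributes into one source's own columns."""
    return [MAPPINGS[source][attr] for attr in query_attrs]

# The same mediated query is rewritten differently per source:
print(rewrite(["schema:name", "schema:email"], "crm"))      # ['full_name', 'mail']
print(rewrite(["schema:name", "schema:email"], "billing"))  # ['cust_name', 'email_addr']
```

Because every source maps to the one shared vocabulary rather than to every other source, adding an nth store means writing one new mapping instead of n-1 pairwise ones.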
