Why Integration and Governance Are Critical for Data Lake Success

This is the final article in a three-part series exploring what it takes to build a data lake that meets the requirements of a truly enterprise-scale data management platform. Earlier installments covered enterprise-scale data management in Hadoop, data onboarding into the data lake, and security. This article focuses on two things: integrating the data lake within the broader enterprise IT landscape, and data governance.

As more data lakes are deployed, patterns are emerging for how they are positioned relative to existing databases, data warehouses, analytic appliances, and enterprise applications in larger organizations.

Some data lakes are deployed from the outset as centralized system-of-record data platforms, serving other systems in an enterprise-scale, data-as-a-service model. As a centralized data lake builds momentum, collecting more data and attracting more use cases and users, its value grows as users collaborate on improving and reusing the data.

Other projects start at the edge of the organization to deliver data and meet the analytic needs of a specific business group. A localized data lake often expands to support multiple teams, or spawns additional separate data lake instances for other groups that want the same improved data access the first group received.

Regardless of which pattern the data lake follows as it lands and expands, its growing role in the organization brings new requirements for enterprise readiness.

To be enterprise-ready, the data lake needs to support a set of capabilities that allow it to be integrated within the company’s overall data management strategy and its landscape of IT applications and data flows.

Here are some requirements to keep in mind:

In addition to streamlining the integration of your data lake, you must prepare the lake to support a broad and expanding community of business users.

As more users begin working with the data lake, whether directly or through downstream applications and reporting or analytic systems, strong data governance becomes more important. Data governance is the final dimension of enterprise readiness.

By bringing together what are typically hundreds of diverse data sets in one large repository and giving users unprecedented direct access to that data, data lakes create new governance challenges and opportunities.

The challenges center on ensuring that data governance policies and procedures exist and are enforced in the lake. Enterprise-ready data governance in the data lake starts with a clear definition of who owns, or has custodial responsibility for, each data asset as it enters the lake and as it is maintained and enhanced through the data lake process.
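
As a minimal sketch of that starting point, and not a description of any particular governance product, the Python below records an owner or custodian for each asset and flags assets that have neither. The `DataAsset` class, its field names, the stage labels, and the sample catalog entries are all hypothetical and invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataAsset:
    """One data set tracked in the lake (all field names are hypothetical)."""
    name: str
    stage: str                        # e.g. "raw", "cleansed", "enriched"
    owner: Optional[str] = None       # accountable business owner, if any
    custodian: Optional[str] = None   # team maintaining the asset, if any


def assets_missing_responsibility(assets: List[DataAsset]) -> List[str]:
    """Return the names of assets that have neither an owner nor a custodian."""
    return [a.name for a in assets if not (a.owner or a.custodian)]


# Illustrative catalog entries; the names and teams are made up.
catalog = [
    DataAsset("crm_contacts", "raw", owner="sales-ops"),
    DataAsset("web_clickstream", "raw"),
    DataAsset("customer_360", "enriched", custodian="data-engineering"),
]

unassigned = assets_missing_responsibility(catalog)
print(unassigned)  # ['web_clickstream']
```

A check like this could run whenever a new data set is onboarded, so that an asset without a responsible party is caught at ingestion rather than discovered later by its users.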

 


