A modernized approach to data lake management

In my last post, I started to look at the use of Hadoop in general and the data lake concept in particular as part of a plan for modernizing the data environment. The data lake certainly has benefits, especially when it's deployed on a low-cost, scalable hardware platform. The significant issue we began to explore is this: the more prolific you become at loading data into the data lake, the greater the chance that entropy will overtake any attempt at proactive management.

Let's presume that you plan to migrate all corporate data to the data lake. The idea of the data lake is to provide a resting place for raw data in its native format until it's needed. Now, let’s imagine what you would need to know at the moment you decide that the data truly is needed.

In other words, you need to know a lot about that data. And here is the most confusing part: you may not even know which data you want! That is part of the promise of the data lake: data is kept around until someone needs it, and it's up to data consumers to determine what data they need and when they need it.

In reality, the simplistic approach to the data lake just won’t work. You need a means for creating a catalog of the data in the data lake so that data consumers have a way to browse through the inventory of data assets to determine which are usable for a particular application or analysis.
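To make the catalog idea concrete, here is a minimal sketch of what such an inventory might look like. It is an illustration only, not a description of any particular product: the class names, the metadata fields, the sample paths and the sample assets are all assumptions made for the example, and a real catalog would persist its entries and capture far more metadata.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CatalogEntry:
    """Metadata for one data asset in the lake (all fields are illustrative)."""
    name: str
    location: str        # hypothetical path to the raw files
    data_format: str     # native format, e.g. "json" or "csv"
    owner: str
    description: str
    tags: List[str] = field(default_factory=list)


class DataCatalog:
    """A simple in-memory inventory that data consumers can browse and search."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        # Record the metadata when the data is loaded, keyed by asset name.
        self._entries[entry.name] = entry

    def browse(self) -> List[str]:
        # Return the names of all registered assets in alphabetical order.
        return sorted(self._entries)

    def search(self, keyword: str) -> List[CatalogEntry]:
        # Case-insensitive match against name, description, and tags.
        needle = keyword.lower()
        matches = [
            entry for entry in self._entries.values()
            if needle in entry.name.lower()
            or needle in entry.description.lower()
            or any(needle == tag.lower() for tag in entry.tags)
        ]
        return sorted(matches, key=lambda entry: entry.name)


# Hypothetical assets, invented purely for this example.
catalog = DataCatalog()
catalog.register(CatalogEntry(
    name="web_clickstream",
    location="/lake/raw/web/clickstream/",
    data_format="json",
    owner="marketing",
    description="Raw page-view events from the public website",
    tags=["web", "events"],
))
catalog.register(CatalogEntry(
    name="store_sales",
    location="/lake/raw/sales/stores/",
    data_format="csv",
    owner="finance",
    description="Daily point-of-sale extracts from retail stores",
    tags=["sales", "retail"],
))

for entry in catalog.search("sales"):
    print(entry.name, entry.data_format, entry.location)
```

The point of the sketch is the design choice rather than the code: metadata is recorded at the moment data lands in the lake, so that a consumer can later search the inventory instead of opening raw files to find out what they contain.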

 
