MongoDB, Hadoop And The Democratization Of Data

 

When I was in engineering school and wanted to get some serious data crunching done (shout-out to Patran/Nastran aficionados), I would go downstairs to the lab and chat up Harvey. He owned the interface to the powerful and expensive mainframe, and nothing was going to get slotted in or processed without the blessing of this high priest of what, at the time, was Big Data. As much as I liked Harvey, boy am I glad that times have changed. But better number-crunching technology didn’t arrive overnight.

First Wave -- Easier Data Access

The democratization of data access within the business has been years in the making. Twenty years ago, there were only a limited number of databases. They often ran on large systems, carried steep licensing costs and other barriers to entry, and were used mainly by people with advanced SQL training. Since then, the world has evolved in a number of different directions. With the advent of MySQL in the mid-1990s, it became free and easy to get started storing data, even with minimal relational database knowledge. MySQL then went on to power much of the website revolution of the late 1990s.

Second Wave -- Cost Effective, Powerful Processing

Just as the first wave was starting to build, the need for a second wave was already emerging. While it became easier to stand up a website and its underlying database, the explosive growth of the internet was creating other problems. Indexing and searching all the content being created was a daunting task. The major search engines such as Yahoo and Google were struggling with traditional ways of searching stacks of data. So instead of indexing every piece of hay in the proverbial haystack in one pass, they found ways to break the work into batches, process each batch separately, and combine the results, an approach that came to be known as “MapReduce.” The foundations for Hadoop came out of these efforts, along with the work on Nutch by Doug Cutting.
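
To make the idea concrete, here is a minimal single-machine sketch of the map/reduce pattern applied to building a small search index. It is an illustration only, not how Google, Yahoo, or Hadoop actually implement it. The function names (map_batch, reduce_postings, build_index), the batch size, and the sample documents are all hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_batch(batch):
    """Map step: emit (word, doc_id) pairs for one batch of documents."""
    pairs = []
    for doc_id, text in batch:
        # set() so each word is emitted once per document
        for word in set(text.lower().split()):
            pairs.append((word, doc_id))
    return pairs


def reduce_postings(pairs):
    """Reduce step: group pairs by word into a sorted list of document ids."""
    index = defaultdict(set)
    for word, doc_id in pairs:
        index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def build_index(documents, batch_size=2):
    """Split documents into batches, map each batch, then reduce the results."""
    batches = [documents[i:i + batch_size]
               for i in range(0, len(documents), batch_size)]
    # Each batch is mapped independently; here they simply run one after another.
    mapped = [map_batch(batch) for batch in batches]
    return reduce_postings(chain.from_iterable(mapped))


# Hypothetical sample data, for illustration only.
docs = [
    (1, "hay in the haystack"),
    (2, "needle in the haystack"),
    (3, "the needle"),
]
index = build_index(docs)
print(index["needle"])
```

Because each batch is mapped without reference to any other batch, the map calls could in principle be handed to separate machines, with only the final grouping step needing to see all the results.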

Third Wave -- The Bridge to Easy and Powerful

While cost-effective processing and easy access sound great, this democratization presents both opportunities and challenges, since most existing data already lives in legacy SQL systems. On the one hand, there are powerful new ways to combine NoSQL, SQL, and Hadoop for new areas such as IoT, as Matt Asay points out. On the other hand, there is now a modern Data Supply Chain (as Dan Woods notes) that must be put into place to manage all of this. That complexity can be intimidating for data architects, who only a decade ago were focused primarily on SQL and didn’t have to piece together NoSQL and Hadoop as well.
