MongoDB, Hadoop And The Democratization Of Data

When I was in engineering school and wanted to get some serious data crunching done (shout out to Patran/Nastran aficionados), I would go downstairs to the lab and chat up Harvey. He owned the interface to the powerful and expensive mainframe, and nothing was going to get slotted in or processed without the blessing of this high priest of what, at the time, was Big Data. As much as I liked Harvey, boy am I glad that times have changed. But the advent of better number-crunching technology hasn’t happened overnight.

First Wave -- Easier Data Access

The democratization of data access within the business has been years in the making. Twenty years ago there were only a limited number of databases, often running on large, expensive systems, with steep licensing and skills barriers that meant only people with advanced SQL training got to use them. Since then the world has been evolving in several directions. With the advent of MySQL in the mid-1990s, it became free and easy to get started storing data, even with a minimum of relational database knowledge. MySQL went on to power much of the website revolution of the late 1990s.

Second Wave -- Cost-Effective, Powerful Processing

Just as the first wave was building, the need for a second wave was already emerging. While it had become easier to stand up a website and its underlying database, the explosive growth of the internet was creating other problems. Indexing and searching all of the content being created was a daunting task, and the major search engines such as Yahoo and Google were straining against traditional approaches to working through ever-larger stacks of data. So instead of trying to crunch the proverbial haystack in one place, they found ways to break the work up and "map-reduce" it across many machines in batches. The foundations of Hadoop came out of these efforts, along with Doug Cutting's work on Nutch.
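
To make the map-reduce idea concrete, here is a minimal sketch in plain Python (not Hadoop itself) of the classic word-count example; the function names and sample documents are illustrative, and the phases that Hadoop would distribute across a cluster simply run in sequence here.

```python
from collections import defaultdict

# Map phase: each worker turns its chunk of documents into (word, 1) pairs.
def map_words(document):
    for word in document.lower().split():
        yield word, 1

# Shuffle phase: group intermediate pairs by key so each key ends up in one place.
def shuffle(mapped_pairs):
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return grouped

# Reduce phase: collapse each key's values into a single result.
def reduce_counts(word, counts):
    return word, sum(counts)

documents = ["the quick brown fox", "the lazy dog", "the fox"]

# In Hadoop these phases run in parallel across many machines; here, in sequence.
mapped = (pair for doc in documents for pair in map_words(doc))
grouped = shuffle(mapped)
results = dict(reduce_counts(w, c) for w, c in grouped.items())
print(results)  # {'the': 3, 'quick': 1, 'brown': 1, 'fox': 2, 'lazy': 1, 'dog': 1}
```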

Third Wave -- The Bridge to Easy and Powerful

While cost-effective processing and easy access sound great, this democracy presents both opportunities and challenges, since most existing data still lives on legacy SQL systems. On the one hand, there are powerful new ways to combine NoSQL, SQL, and Hadoop for emerging areas such as IoT, as Matt Asay points out. On the other hand, a modern Data Supply Chain (as Dan Woods notes) must be put in place to manage all of this. That complexity can be intimidating for data architects who, only a decade ago, focused primarily on SQL and didn't have to piece together NoSQL and Hadoop as well.
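
As a rough illustration of one stage in such a Data Supply Chain, the sketch below pulls raw IoT readings from MongoDB, summarizes them with an aggregation pipeline, and loads the rollup into a SQL table that legacy reporting tools can query. The connection string, database, collection, and field names are assumptions made for the example, not part of any particular product's setup.

```python
import sqlite3
from pymongo import MongoClient

# Raw sensor documents land in MongoDB (the NoSQL side of the supply chain).
mongo = MongoClient("mongodb://localhost:27017")
readings = mongo["iot"]["sensor_readings"]

# Summarize in MongoDB: average temperature per device.
pipeline = [
    {"$group": {"_id": "$device_id", "avg_temp": {"$avg": "$temperature"}}},
]
rollup = [(doc["_id"], doc["avg_temp"]) for doc in readings.aggregate(pipeline)]

# Push the summary into a relational table for the SQL side of the house.
sql = sqlite3.connect("warehouse.db")
sql.execute(
    "CREATE TABLE IF NOT EXISTS device_avg_temp (device_id TEXT, avg_temp REAL)"
)
sql.executemany("INSERT INTO device_avg_temp VALUES (?, ?)", rollup)
sql.commit()
```

In a production pipeline the aggregation might instead run as a Hadoop or Spark job over the same data, but the shape of the stage is the same: collect in a flexible store, summarize at scale, and deliver results where the business already knows how to consume them.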
