The future of big data federation may have just landed

Big data is no longer a war between batch and streaming data processing. Getting data from disparate data stores and running analytics across them in real time is a huge technological challenge.

Cracking this data federation problem has become a Holy Grail of sorts. And while there are two primary approaches to cracking this problem today, a third has emerged that just might offer the most promise.

One of the ongoing barriers to greater big data adoption is the complexity of the associated software. The industry really needs to tackle this, offering end users the ability to query whatever data they want, wherever it is, no matter what format, and all without going through IT.

Which is mostly impossible today.

Two general approaches are used for data federation today, each with its own strengths and weaknesses.

Database-Centric Approach

The first, database-centric approach is used by relational database (RDBMS) vendors like Teradata (QueryGrid) and IBM (FluidQuery), and by specialty technologies like the former Composite Software.

One of the biggest problems with such database-centric tools is that they're geared for DBA-type users, not business users and analysts. Further, these tools generally do not cover all types of big data. Most were designed for data that fits into tables and columns, but search, streams, and semi-structured or unstructured data (for which NoSQL databases are well-suited) do not necessarily fit as well.

In addition, performance can be an issue when attempting speed-of-thought analytics on a traditionally federated source.

Query Tool-Centric Approach

The query tool-centric approach is used by Tableau, Qlik, and others.

These technologies do allow end users to mash up multiple sources, but they may not scale to big data volumes, because the data is often mashed up on the user's desktop computer or in the web browser rather than in a scalable big data backend like Apache Spark.

And again, they were not really designed for the variety of big data sources, or for anything beyond fairly trivial, low-cardinality mashups.

The New Kid On The Block

The new approach to data federation comes from Zoomdata, which just announced its Fusion product along with an early access program to give companies a taste. Zoomdata claims Fusion can make multiple data sources appear as one source without moving or transforming data.

If it works as advertised, this would allow a business user to define a fused data source without waiting for a data architect to set it up ahead of time. Rather than requiring a command line, Fusion is exposed as a simple drag-and-drop user interface that hides the underlying Spark-based infrastructure, which combines datasets in ways that were not practical before.
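To make the idea of a "fused" source concrete, here is a minimal Python sketch. It is not Zoomdata's implementation or API: the `FusedSource` class, its `attach` method, and the two sample sources are all hypothetical names invented for illustration. The sketch only shows the general notion of several sources being read in place at query time and presented as one set of rows joined on a shared key.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


class FusedSource:
    """Hypothetical fused source: joins rows from several sources on a shared key."""

    def __init__(self, key: str) -> None:
        self.key = key
        self.fetchers: List[Callable[[], Iterable[Row]]] = []

    def attach(self, fetch: Callable[[], Iterable[Row]]) -> "FusedSource":
        # Register a source; nothing is copied until rows() is called.
        self.fetchers.append(fetch)
        return self

    def rows(self) -> List[Row]:
        # Read each source at query time and merge rows that share the key.
        merged: Dict[object, Row] = {}
        for fetch in self.fetchers:
            for row in fetch():
                merged.setdefault(row[self.key], {}).update(row)
        return list(merged.values())


def crm_rows() -> List[Row]:
    # Stand-in for a relational source (assumed sample data).
    return [{"customer_id": 1, "name": "Acme"}, {"customer_id": 2, "name": "Globex"}]


def clickstream_rows() -> List[Row]:
    # Stand-in for a semi-structured source (assumed sample data).
    return [{"customer_id": 1, "page_views": 42}, {"customer_id": 2, "page_views": 7}]


fused = FusedSource(key="customer_id").attach(crm_rows).attach(clickstream_rows)
fused_rows = fused.rows()
```

In this toy version the merge happens in one process; the article's point is that a real product would hide a distributed engine such as Spark behind the same kind of simple definition.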

While that is interesting in itself, the real power comes from Zoomdata's ability to push as much of the processing as possible down to each underlying data platform, based on the capabilities and performance profile of those systems, and to use Spark for the rest of the work that can't or shouldn't be pushed down.
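The push-down idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how Fusion works internally: an in-memory SQLite database stands in for a platform that can aggregate on its own, a plain list stands in for a source with no query capability, and the function names are hypothetical. The aggregation is pushed into the SQL source, while the federation layer (the role the article assigns to Spark) handles the source that cannot do the work itself and combines the partial results.

```python
import sqlite3
from collections import defaultdict
from typing import Dict, List, Tuple

# Source A: a SQL engine that can filter and aggregate by itself.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 70.0)],
)

# Source B: a raw event feed with no query capability (assumed sample data).
events = [
    {"region": "east", "amount": 25.0},
    {"region": "west", "amount": 30.0},
    {"region": "west", "amount": 5.0},
]


def totals_from_sql(connection: sqlite3.Connection) -> List[Tuple[str, float]]:
    # Push the GROUP BY down: only one row per region leaves the database.
    cur = connection.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region"
    )
    return cur.fetchall()


def totals_from_events(rows: List[dict]) -> List[Tuple[str, float]]:
    # No push-down possible here, so aggregate in the federation layer.
    acc: Dict[str, float] = defaultdict(float)
    for row in rows:
        acc[row["region"]] += row["amount"]
    return list(acc.items())


def federated_totals(connection: sqlite3.Connection, rows: List[dict]) -> Dict[str, float]:
    # Combine the partial aggregates from both sources into one answer.
    combined: Dict[str, float] = defaultdict(float)
    for region, total in totals_from_sql(connection) + totals_from_events(rows):
        combined[region] += total
    return dict(combined)


result = federated_totals(conn, events)
```

The design point is that the SQL source ships one row per region instead of every order, which is why pushing work down tends to matter for federated query speed.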

That matters because federation never really worked well before: no one knows exactly the right questions to ask ahead of time, and federated queries were often really slow to run.

The Zoomdata approach is the exact opposite. It allows users to hook up their own data and run queries with fast results. That ability to truly iterate on big data (historical and real-time, enterprise and cloud) can be transformative to a company.


