This is part two of a two-part series from Hadoop Summit. In his post, Rob Bearden talks about how data transforms everything and the need for Connected Data Platforms. As a follow-on, here are four predictions for the technologies behind this transformation.
We all accept that the number of connected devices and sensors will continue to grow. The volume, variety, and complexity of data will continue to explode alongside it. Currently, data is doubling every few years. A vast set of public, private, and hybrid data clouds is emerging.
The need for machine-to-machine (or peer-to-peer) connectivity in the Internet of Anything (IoAT) is defining two things: the need for ‘an interface of things’ and the need for an intelligent environment where devices that can understand each other work together in real time, in the context of a larger need or purpose. It also requires the ability for networks to expand and contract on demand, as well as for messages to be routed and prioritized in real time.
In this world, silos of data will be replaced by clouds of data. Freed from the need to conform to the rules of data center batch processing, we will see intelligent, self-configuring networks that enable meaningful connections between devices and provide the flexibility to deliver data faster, and in new ways, to the right place to be analyzed.
An integrated model where data and analytics flow seamlessly is key here. Keep in mind one important notion: with the advent of connected devices, our products now create, consume, and use data as well, and they are a key part of the end-to-end business flow.
This new reality of connected customers, products, and supply chains demands real-time machine learning, intra-system collaboration, and analytics at the edge. No longer is the world going to accept ‘centralized-only’ monolithic software and silos of data. Most ‘smart’ devices will collaborate to varying degrees and be able to analyze what the others are saying. Real-time machine-learning algorithms within modern distributed data applications will come into play: algorithms that are able to adjudicate ‘peer-to-peer’ decisions in real time.
Data has gravity; in relative terms, it’s still more expensive to move than to store. This will spur the notion of processing analytics out at the edge, where the data was born and exists, in real time, rather than moving everything into the cloud or back to a central location. Plus, we’ll need to keep track of all of these machine-to-machine conversations, and what happened as a result, in order to build better and more intelligent models and distributed solutions over time.
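To make the edge-processing idea concrete, here is a minimal sketch in Python. It assumes a hypothetical device that collects a window of numeric sensor readings and forwards only a compact summary plus any out-of-range values, instead of shipping every raw reading to a central location. The function name `summarize_at_edge`, the threshold, and the sample values are illustrative assumptions, not part of any particular product.

```python
from statistics import mean


def summarize_at_edge(readings, threshold):
    """Reduce a window of raw readings to a small summary record.

    Hypothetical helper: only this summary would be sent upstream,
    while the raw readings stay on the device where they were produced.
    """
    anomalies = [r for r in readings if r > threshold]
    return {
        "count": len(readings),
        "mean": mean(readings),
        "max": max(readings),
        "anomalies": anomalies,  # only the out-of-range values travel upstream
    }


# Illustrative window of temperature readings (made-up values).
window = [20.1, 20.3, 19.9, 35.2, 20.0]
summary = summarize_at_edge(window, threshold=30.0)
print(summary)
```

The design choice here is the one the paragraph describes: the analysis runs where the data lives, and the network carries the result rather than the raw stream.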
We all hear about self-driving cars and trucks. Just imagine what an impact they will have on lives. Whether it’s better safety, better fuel efficiency, smoother traffic management, or cheaper public transportation and movement of goods, the personal and economic implications are far-reaching.
For autonomous cars not to run into each other, they actually have to communicate with each other directly as well as with the cloud. They can’t afford to send a request to some place in the cloud, run an algorithm there, and wait for the result to come back. They have to talk to each other with a level of intelligence, and so peer-to-peer and edge analytics will come into play.
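A rough back-of-the-envelope calculation shows why the round trip matters. The sketch below assumes a vehicle speed of 30 m/s (about 108 km/h), a 100 ms cloud round trip, and a 10 ms direct vehicle-to-vehicle exchange. All three numbers are illustrative assumptions rather than measurements.

```python
def distance_during_wait(speed_m_per_s, latency_ms):
    """Distance in metres a vehicle covers while waiting for a reply."""
    return speed_m_per_s * latency_ms / 1000


# Illustrative assumptions, not measured values.
SPEED_M_PER_S = 30          # about 108 km/h
CLOUD_ROUND_TRIP_MS = 100   # hypothetical cloud round trip
PEER_MESSAGE_MS = 10        # hypothetical direct vehicle-to-vehicle exchange

cloud_distance = distance_during_wait(SPEED_M_PER_S, CLOUD_ROUND_TRIP_MS)
peer_distance = distance_during_wait(SPEED_M_PER_S, PEER_MESSAGE_MS)

print(f"Travelled while waiting on the cloud: {cloud_distance:.1f} m")
print(f"Travelled while waiting on a peer:    {peer_distance:.1f} m")
```

Under these assumptions the car covers ten times more road while waiting on the cloud than while waiting on a nearby peer, which is the gap that peer-to-peer and edge analytics are meant to close.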