The rise of Big Data, and the industry’s IoT craze, are driving huge demand for streaming data analytics. There’s an impediment though: streaming data is hard to work with. 2016 will heighten the demand, and also the tension around the difficulty. It may also force a solution.
In the big-data era, businesses yearn to be data-driven in their decision making processes. Rather than act on hunch, they’d prefer to observe and measure, then make decisions accordingly. That’s a laudable standard – but it raises the bar. A culture driven by data, in turn, drives a desire for real-time, streaming data.
And if culture didn’t drive that desire, technology trends would. Analyzing Web logs in real-time can help drive multi-channel marketing initiatives with much more immediate impact. The extremely-hyped IoT – the Internet of Things – is all about the observation of ephemeral sensor readings, the quintessential streaming data. But the value of that data is ephemeral as well, making real-time, streaming analytics a necessity. In the consumer packaged goods world, for example, using sensors to monitor temperature and other physical attributes of manufacturing equipment helps assure things are running smoothly. These readings can also be fed into predictive models to prevent breakdowns before they happen, and keep the assembly lines running.
It’s exciting. The use cases and the demand for streaming are in place, and 2016 is poised to be the year when streaming analytics crosses over from niche to mainstream.
The rewards of being able to analyze data in real-time are huge, but the effort often has been as well. Working with streaming data isn’t like working with data at rest. It involves a paradigm shift, a different skill set, a different outlook, a change of mindset.
To understand why, we need only rewind a bit, to real-time data technology before the Big Data era. That category of software, known as complex event processing, or CEP, set the precedent for streaming data and difficulty going hand-in-hand.
Back then, data was smaller and accepted latencies were higher. That meant CEP was niche, allowing it to be difficult, expensive and inconsistent with other data technology.
Querying data at rest involves an architecture and approach that has been with us for more than 20 years: find a connector/driver/provider that can talk to your database, feed it a SQL query, and get your results back as a set of rows and columns. This is a pattern with which virtually every technologist is familiar. It’s a universal, standard, shared concept.
But with streaming data, you need to work with data structured as “messages.” Messaging systems work on the premise of “publishing” small bit of data to queues, to which other systems can “subscribe.” They are thus often referred to as pub/sub message buses. Message buses don’t work like databases.;