How big data delivers data-driven stories

Sourcing a data-driven story is a complicated process. The sheer volume of information in even medium-sized datasets makes it difficult to know which vein of information contains the most impressive or significant story, no matter how good a person may be at pattern recognition or spotting trends. Even if you believe you have found something insightful and original, there might be an even more interesting take on the same data that is only visible when the data is viewed at its most granular level.

Databases and data visualisation tools have become invaluable for finding the narratives in big data. However, the technical constraints of such tools mean that the larger a dataset is, the further it needs to be shrunk to become manageable and to reduce processing time. This sacrifices crucial granularity, to the extent that interesting stories which rely on high levels of detail are lost.

So, what can be done? Let’s look at a real-life example of how all of the data can come into play when you have the right tools available, and why the most detailed data can be the most valuable.

In November 2016, InterWorks’ Tableau Zen Master Robert Rouse took part in the ‘Iron Viz’ competition with several other experts at Tableau’s worldwide customer and partner conference. The challenge was for each contestant to demonstrate the best use of Tableau when creating a data-driven story, with each contestant drawing from the same 14 gigabytes and 161 million rows of data on New York taxi journeys. Robert analysed how snowfall and public holidays affected trip counts and overall taxi fares, with an emphasis on visualising the effect that snow days had on New York taxi fares. The contest provided a perfect demonstration of Tableau’s utility when creating data-driven stories through visualisations.

For the competition, Robert worked from a reduced dataset created by taking a daily aggregate of trips taken and fares earned over 2014. For the map visualisation, the dataset was further shrunk to a three-day period, chosen to best highlight the changes in trip frequency that a single snow day caused across New York. The data had to be reduced this much because it was simply not possible to work with the complete dataset within the visualisation tool, and the reduced dataset lost much of its granularity and detail.
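To make the trade-off concrete, here is a minimal Python sketch of a daily aggregate of this kind. It is not Robert's actual workflow: the record layout, the `daily_aggregate` helper and the sample values are all invented for illustration. It shows how trip-level rows collapse into one row per day, at which point the time of day of each trip can no longer be recovered.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical trip records: (pickup timestamp, fare in dollars).
# These values are invented for illustration; they are not from the taxi dataset.
trips = [
    ("2014-01-20 08:15:00", 12.50),
    ("2014-01-20 18:40:00", 9.00),
    ("2014-01-21 07:55:00", 22.00),
    ("2014-01-21 08:05:00", 7.50),
    ("2014-01-21 23:30:00", 31.00),
]


def daily_aggregate(records):
    """Collapse trip-level rows into one (trip count, total fare) pair per day."""
    totals = defaultdict(lambda: [0, 0.0])
    for pickup, fare in records:
        # Keep only the calendar date; the time of day is discarded here.
        day = datetime.strptime(pickup, "%Y-%m-%d %H:%M:%S").date().isoformat()
        totals[day][0] += 1
        totals[day][1] += fare
    return {day: (count, total) for day, (count, total) in sorted(totals.items())}


daily = daily_aggregate(trips)
for day, (count, total) in daily.items():
    print(day, count, total)
```

Five trip rows become two daily rows. A question such as "did the late-night trips cost more?" can be answered from the original records but not from the aggregate, which is the kind of detail the reduced dataset gives up.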

Robert later re-ran the analysis from the competition for a webinar, again to demonstrate the effectiveness of visualising data, but this time the dataset was handled by EXASOL’s in-memory analytic database instead of Tableau data extracts. In this re-run, Robert showcased the level of work required to identify and extract a story from such a large dataset, and demonstrated how, by using a powerful analytic database, he could draw from the complete dataset.
