Can Big Data algorithms tell better stories than humans?

What if computer algorithms could tell more compelling stories than journalists, writers or business analysts? This is increasingly becoming a reality: a new generation of Big Data tools is being used to automate storytelling.

The ideas behind this application of analytics were first put to use generating automated news reports, covering sports and financial stories. Take the recent Wimbledon tennis championships as an example. The Slamtracker system developed by IBM monitors each game using sensors and cameras, generating millions of real-time data points covering speed of serve, forced and unforced errors, and even the social media sentiment surrounding each game. This data can then be turned into automated stories or Twitter messages to ensure Wimbledon are the first to break news stories about the results.
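To make the idea concrete, here is a minimal sketch of how match statistics might be turned into a short report using simple sentence templates. It is not how Slamtracker or any commercial system works; the `PlayerStats` fields, the `match_summary` function and the sample figures used with it are all invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class PlayerStats:
    # Hypothetical per-player statistics; the field names are assumptions.
    name: str
    sets_won: int
    fastest_serve_mph: float
    unforced_errors: int


def match_summary(a: PlayerStats, b: PlayerStats) -> str:
    """Turn two players' match statistics into a one-paragraph report."""
    winner, loser = (a, b) if a.sets_won > b.sets_won else (b, a)
    sentences = [
        f"{winner.name} beat {loser.name} "
        f"{winner.sets_won}-{loser.sets_won} in sets."
    ]
    # Mention the fastest serve, whoever hit it.
    fastest = max(a, b, key=lambda p: p.fastest_serve_mph)
    sentences.append(
        f"{fastest.name} hit the fastest serve of the match at "
        f"{fastest.fastest_serve_mph:.0f} mph."
    )
    # Only mention errors when they help explain the result.
    if loser.unforced_errors > winner.unforced_errors:
        sentences.append(
            f"{loser.name} made {loser.unforced_errors} unforced errors "
            f"to {winner.name}'s {winner.unforced_errors}."
        )
    return " ".join(sentences)
```

Calling `match_summary(PlayerStats("Player A", 3, 128.0, 18), PlayerStats("Player B", 1, 131.0, 35))` yields three sentences: the result, the fastest serve and the error count. The data decides which sentences appear, which is the essence of structuring data into a narrative.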

Journalists have already expressed worries that technology like this could put them out of a job. But the truth is that if the process of structuring data into a narrative can be taught to a human, it can be taught to a computer too.

Kris Hammond, co-founder and chief scientist at Narrative Science, which has created the Quill natural language generation platform, realized early on that technology could be used to turn information into easy-to-understand narratives. In fact, Quill is a regular contributor to Forbes, just like me. You can see its latest contributions here.

Quill and competitors such as Automated Insights are used by other media outlets. But because little is known about how trustworthy readers would consider reports created by algorithms, many news publishers may be reluctant to admit whether their stories, or parts of them, are generated by computers.

The implications of this technology go further than putting journalists out of work, however. In fact, Hammond concedes that Quill isn’t yet great at finding news stories; its strengths lie in putting stories together from specific data sources. Narrative Science is currently running one application which reads the stock market and attempts to spot when unusual highs, lows or volume spikes could have important implications, but Hammond calls this a “very controlled” instance of Quill digging up its own stories. He stands by his claim, made in 2012, that a computer would be able to write Pulitzer Prize-quality journalism within five years, although he admits the clock is ticking!
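As a rough illustration of what spotting a volume spike could involve, the sketch below flags days on which trading volume sits far above its trailing average and renders each one as a sentence. This is an assumption about one simple way to do it, not a description of Quill; the function names, the five-day window and the threshold of three standard deviations are all arbitrary choices, and it covers only volume, not price highs or lows.

```python
from statistics import mean, stdev


def volume_spikes(volumes, window=5, threshold=3.0):
    """Return (day_index, ratio_to_average) for unusually busy days.

    A day is flagged when its volume is more than `threshold` sample
    standard deviations above the mean of the previous `window` days.
    """
    spikes = []
    for i in range(window, len(volumes)):
        history = volumes[i - window:i]
        mu, sigma = mean(history), stdev(history)
        # A flat history has no spread, so a z-score is undefined; skip it.
        if sigma > 0 and (volumes[i] - mu) / sigma > threshold:
            spikes.append((i, volumes[i] / mu))
    return spikes


def spike_alerts(ticker, volumes, window=5, threshold=3.0):
    """Render each flagged day as a one-sentence alert."""
    return [
        f"{ticker} traded {volumes[day]:,} shares on day {day}, "
        f"{ratio:.1f} times its recent average."
        for day, ratio in volume_spikes(volumes, window, threshold)
    ]
```

Deciding whether a spike actually has "important implications" is the hard part, and nothing here attempts it. That may be one reason Hammond describes the real application as very controlled.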

No, the real value, Hammond says, is not in the scattershot approach of news publishing, where one article is created for a vast audience in the hope that some will find it interesting or useful. Natural language generation and automated narrative creation mean that one dataset can be interpreted in multiple ways, giving each targeted audience segment precisely what they need to know, without any confusing background noise.

This makes it ideal for corporate communications, where, for example, a company’s financial, customer and operations data can be interpreted and the resulting insights reported directly to whichever people in the organization are in the best position to make a change.

So, for example, if an algorithm running at a manufacturing company were to pick up on the fact that a bottleneck in the production of one component was leading to an overall loss in revenue, it could create tailored reports for every department involved in the process, explaining the situation and the best course of action to correct it. Doing this manually would be a very time-consuming undertaking.
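The manufacturing example can be sketched in a few lines: one finding, several audiences, one template per audience. Everything in this sketch is hypothetical, including the `Bottleneck` fields, the department names and the wording of the templates; a real system would derive the finding from live data and use far richer language generation.

```python
from dataclasses import dataclass


@dataclass
class Bottleneck:
    # Hypothetical finding produced by some upstream analysis.
    component: str
    units_short_per_week: int
    revenue_lost_per_unit: float

    @property
    def weekly_revenue_loss(self) -> float:
        return self.units_short_per_week * self.revenue_lost_per_unit


# Each department gets its own framing of the same finding.
TEMPLATES = {
    "production": (
        "Output of {component} is {units} units per week below demand. "
        "Raising capacity on this line is the priority."
    ),
    "finance": (
        "A shortfall in {component} is costing an estimated "
        "${loss:,.0f} per week in lost revenue."
    ),
    "sales": (
        "Expect delays on orders that include {component} until "
        "production catches up by {units} units per week."
    ),
}


def tailored_reports(finding: Bottleneck) -> dict[str, str]:
    """Return one short report per department for a single finding."""
    return {
        dept: template.format(
            component=finding.component,
            units=finding.units_short_per_week,
            loss=finding.weekly_revenue_loss,
        )
        for dept, template in TEMPLATES.items()
    }
```

For a made-up finding such as `Bottleneck("gearbox housing", 120, 250.0)`, finance would read about a $30,000 weekly loss while production would read about a 120-unit shortfall, both drawn from the same underlying numbers.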

Just as with other high-tech developments of today – driverless cars spring to mind – earning the trust of humans is essential. The algorithms must allow for full sourcing and accountability. This is why, although Natural Language Generation is the foundation of this sort of technology, the data and analytics which underpin it are just as important.

At the moment, automated narratives generally work well with structured data – information such as numbers and measurements which fit nicely into a spreadsheet and can be compared quantitatively. In the future, I would expect to see more of the messy, unstructured data which we are increasingly generating and collecting included in these processes. For example, video data could be analyzed and interpreted to add color and insight to reports. Going back to news reporting, CCTV footage could tell us whether streets were empty or crowded at the time of an armed robbery. At the same time, social media analysis could bolster reports with an ad-hoc assessment of public sentiment towards any relevant issue.

Narratives are one of the most important tools we have. Humans have always told stories – fictional, real or somewhere in between – as a way of passing on information and influencing events. Giving that power to computers may, to some, seem a step too far. But don’t we already often distrust the concept of “narrative”? The word is commonly used interchangeably with “spin” to suggest that someone is tailoring their depiction of events to suit their own needs. Computers can’t “spin” (unless they are programmed to, of course), so for news reporting, or for conveying hard facts about a business, couldn’t they be seen as more trustworthy than humans?


