Machine Learning Is Making Unstructured Data Accessible

In a 2013 report, IBM estimated that roughly 2,500,000TB of data was created every day. The figure very likely far exceeds this now, as wearables, AI, and connected devices have increasingly embedded themselves in society, gathering a veritable tidal wave of additional information for organisations to interrogate.

This data comes in three forms: unstructured, semi-structured, and structured. Since the dawn of IT, structured data has been the main resource of analysts, and it remains so today. In a 2015 IDG Enterprise study on big data and analytics, 83% of IT professionals said structured data initiatives were a high priority at their organizations, while just 43% said unstructured data initiatives were a top priority. Yet it is estimated that 90% of all data is either semi-structured or unstructured. For organizations, that is a tremendous number of potential insights to be leaving on the table.

Structured data is anything that fits in a relational database: data that falls within a certain set of values or has a specific set of characteristics. Semi-structured data has no formal data model but still has some kind of structure, e.g. emails, zipped files, HR records, and XML data. Unstructured data, meanwhile, is everything that does not fit into relational databases. This includes videos, PowerPoint presentations, company records, social media, RSS, documents, and text.
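To make the three categories concrete, here is a minimal Python sketch using only the standard library. The table, the XML record, and the review text are invented for illustration and are not drawn from the studies cited above. The point is how differently each form can be queried.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Structured: rows conform to a fixed relational schema, so a query
# can aggregate them directly. (Hypothetical "orders" table.)
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.execute("INSERT INTO orders (customer, total) VALUES (?, ?)", ("Acme", 129.5))
structured_total = conn.execute("SELECT SUM(total) FROM orders").fetchone()[0]

# Semi-structured: tags give the record some shape, but no schema
# dictates which fields appear. (Hypothetical HR record as XML.)
xml_record = "<employee><name>Jo</name><skills><skill>SQL</skill></skills></employee>"
root = ET.fromstring(xml_record)
semi_structured_name = root.findtext("name")

# Unstructured: free text with no fields at all. Without further
# analysis, only crude measures such as a word count are available.
review = "Delivery was late again and nobody answered the phone."
word_count = len(review.split())

print(structured_total, semi_structured_name, word_count)
```

The structured and semi-structured records answer a direct question (a total, a name), whereas the free-text review only yields its meaning once something interprets the language itself.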

Both structured and unstructured data are necessary to use analytics to its full potential, to build a complete picture of a company’s health, and to pinpoint areas for growth. Essentially, structured data analytics describes what’s happening, while unstructured data analytics explains why it’s happening. Knowing what’s happening may enable you to form an idea of the situation and take action, but without understanding why, you run too high a risk that the action is wrong.

There are several reasons why companies have so far largely failed to analyze their unstructured data in any meaningful way, chief among them the simple absence of the tools needed to do it. Advances in machine learning, however, mean that many now can, analyzing their mountains of unstructured content in ways they could not before.

Machine learning is valuable for the analysis of structured data, but it is indispensable for its unstructured counterpart because of the difference in scale. A human being simply cannot process that amount of data.
