Have a case of dirty data? Treat it with a dose of Data Transformation
- by 7wData
What great terms – “Dirty” and “Ugly” data. The descriptions are damning, suggesting something that should never be seen, let alone experienced. If the data under discussion is yours, how would you respond?
a) Pretend it’s nothing to do with you?
b) Try to shift the conversation onto another topic? or
c) Walk away and give up on the data altogether?
The answer is “None of the above”. “Dirty” and “ugly” describe data that has not been cleaned, formatted and enriched for optimal use in reporting. It may contain duplicates or errors that make accurate queries impossible, or it may be structured in a way that makes reporting through your BI platform too cumbersome. Under these circumstances, queries become slow and reports difficult to generate, and it can take hours or days to obtain information that is urgently needed.
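To make “duplicates and errors” concrete, here is a minimal sketch in Python of one small cleaning step: normalising text fields, then dropping rows that turn out to be the same record. The field names (`customer`, `email`) and the sample rows are hypothetical, chosen only for illustration; real data will need rules that match its own quirks.

```python
def clean_records(records):
    """Normalise text fields and drop duplicate rows, keeping the first occurrence."""
    seen = set()
    cleaned = []
    for record in records:
        # Collapse stray whitespace and standardise casing so that
        # near-identical rows compare as equal.
        name = " ".join(record["customer"].split()).title()
        email = record["email"].strip().lower()
        key = (name, email)
        if key in seen:
            continue  # an equivalent row has already been kept
        seen.add(key)
        cleaned.append({"customer": name, "email": email})
    return cleaned


# Hypothetical sample: the first two rows are the same customer typed two ways.
raw = [
    {"customer": "  acme ltd ", "email": "Sales@Acme.example"},
    {"customer": "Acme Ltd", "email": "sales@acme.example "},
    {"customer": "Globex", "email": "info@globex.example"},
]
cleaned = clean_records(raw)
```

In this sketch the two differently typed rows collapse into a single record. Without that step, a report counting customers would treat them as two.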
The reality is that, because we are human and imperfect, every company has some degree of dirty or ugly data. For some organisations it’s a minor inconvenience. For others, poor data can severely limit reporting possibilities. No matter how well prepared a business is, it’s hard to identify the true extent of any data problems until you’ve deployed your BI platform and started to generate reports.
It normally goes like this… Business users read an article or attend an event and learn what’s possible with analytics. Then they get excited about the opportunities within their own company. They convince the right people and before you know it, an analytics solution is deployed. It’s about this time they realise they have a problem with their data. As they try to untangle the database and cleanse the data, the time investment for the project goes through the roof and frustration sets in.
The solution could require consulting time to have an outside expert build the reports, or the hiring of additional analysts to take on the job in-house.