How can organisations avoid quality degradation, given the sheer volumes and many disparate data sources they have to manage?
With the birth of the Internet and the pervasive nature of technology, it’s no wonder the majority of data in the world has been generated over the last few years. As we continue to embrace the Internet of Things (IoT), it’s safe to say we’re on track to beat data generation records year on year.
This explosion of data is pushing enterprises in a more data-driven direction; organisations are performing complex analysis on their data to develop new revenue streams, streamline operations and enhance the customer experience.
One of the key concerns during this analysis is the quality of the data. With IT systems comprising legacy, cloud and standalone applications, plus the integration of social network and third-party feeds, synchronising this data is a real challenge.
Over time, original reference data can often become fragmented for a myriad of reasons. However, the three we see most commonly are:
Master data being held across multiple applications, often with different data architectures;
A dependency on the end user to keep their information updated regularly, despite the user having no motivation to do so; and
Data being updated in only one application when it should be updated across multiple systems in real time, without impacting the existing setup.
As soon as the data is out of sync, the effort and money invested in data analytics are effectively wasted.
Data quality management poses its own challenges. Synchronising data across systems often requires complex string comparison operations, and the process sometimes needs costly changes to the data design of existing applications.
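To give a feel for the kind of string comparison involved, here is a minimal sketch in Python. It is an illustration only, not a description of any particular product: the two systems (`crm` and `billing`), the record keys, the helper names and the 0.9 similarity threshold are all assumptions made for the example. It uses only the standard library's `difflib` module.

```python
from difflib import SequenceMatcher


def normalise(value: str) -> str:
    # Lowercase and collapse runs of whitespace so that purely cosmetic
    # differences between systems are not reported as mismatches.
    return " ".join(value.lower().split())


def similarity(a: str, b: str) -> float:
    # Ratio between 0.0 and 1.0; 1.0 means the normalised strings are identical.
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def find_mismatches(crm: dict, billing: dict, threshold: float = 0.9) -> list:
    # Hypothetical check: compare the value held for each key in one system
    # against the other, and report values that are missing or too different.
    mismatches = []
    for key, crm_value in crm.items():
        billing_value = billing.get(key)
        if billing_value is None:
            mismatches.append((key, crm_value, None))
        elif similarity(crm_value, billing_value) < threshold:
            mismatches.append((key, crm_value, billing_value))
    return mismatches


# Illustrative data only.
crm = {"c1": "Acme Ltd", "c2": "Jane  Smith", "c3": "Globex"}
billing = {"c1": "ACME ltd", "c2": "John Doe"}

for key, ours, theirs in find_mismatches(crm, billing):
    print(key, repr(ours), repr(theirs))
```

Even this toy version has to make judgement calls about normalisation and thresholds, and it only covers two systems holding one field per record. Real reference data spread across many applications multiplies that effort.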
However, there is already a solution that can be used to improve data quality, one that is rooted in existing best practices of software development.