
Pushing data quality beyond boundaries

Throughout my long career of building and implementing data quality processes, I've consistently been told that data quality could not be implemented within data sources, because doing so would disrupt production systems. Therefore, source data was often copied to a central location – a staging area – where it was cleansed, transformed, deduplicated, restructured and loaded into new applications, such as an enterprise data warehouse or master data management hub.
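
As a rough illustration of that staging-area pattern, here is a minimal sketch in Python, with SQLite standing in for both the source system and the warehouse. The customers table and its columns are invented for the example, not taken from any particular system; the point is simply that the cleansing, deduplication and loading all happen outside the source.

```python
import sqlite3

def extract_to_staging(source_conn, staging_conn):
    """Copy raw customer rows out of the source system into a staging table."""
    rows = source_conn.execute(
        "SELECT customer_id, name, email FROM customers"
    ).fetchall()
    staging_conn.execute(
        "CREATE TABLE IF NOT EXISTS staging_customers (customer_id, name, email)"
    )
    staging_conn.executemany("INSERT INTO staging_customers VALUES (?, ?, ?)", rows)

def cleanse_and_deduplicate(staging_conn):
    """Standardize values and drop duplicates, entirely outside the source system."""
    rows = staging_conn.execute(
        "SELECT customer_id, name, email FROM staging_customers"
    ).fetchall()
    seen, cleaned = set(), []
    for customer_id, name, email in rows:
        email = (email or "").strip().lower()
        name = (name or "").strip().title()
        if email in seen:  # crude duplicate rule for the sketch: same email address
            continue
        seen.add(email)
        cleaned.append((customer_id, name, email))
    return cleaned

def load_to_warehouse(warehouse_conn, cleaned_rows):
    """Load the cleansed, deduplicated rows into the warehouse's own table."""
    warehouse_conn.execute(
        "CREATE TABLE IF NOT EXISTS dw_customers (customer_id, name, email)"
    )
    warehouse_conn.executemany("INSERT INTO dw_customers VALUES (?, ?, ?)", cleaned_rows)

if __name__ == "__main__":
    source = sqlite3.connect(":memory:")
    staging = sqlite3.connect(":memory:")
    warehouse = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE customers (customer_id, name, email)")
    source.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, " alice ", "Alice@Example.com"), (2, "Alice", "alice@example.com")],
    )
    extract_to_staging(source, staging)
    load_to_warehouse(warehouse, cleanse_and_deduplicate(staging))
    print(warehouse.execute("SELECT * FROM dw_customers").fetchall())
```

Notice that the clean result ends up in dw_customers, a different database than the one the business actually runs on, which is exactly the boundary described next.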

This paradigm of dragging data from where it lives through data quality processes that exist elsewhere (and whose results are stored elsewhere) had its advantages. But one of its biggest disadvantages was the boundary it created – original data lived in its source, but quality data lived someplace else.

These boundaries multiply with the number of data sources an enterprise has. That's why a long-stated best practice has been to implement data quality processes as close to the data source as possible.
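
One way to read "as close to the data source as possible" is to run the quality checks at the point of capture, so that defective records are rejected before they are ever written to the source. The fragment below is a hedged sketch of that idea in Python; validate_customer, insert_customer and the customers table are hypothetical names chosen for illustration, not part of any real system.

```python
import re

# Hypothetical validation rules; a real source system would enforce its own.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def validate_customer(record: dict) -> list:
    """Return the data quality problems found in one record at capture time."""
    problems = []
    if not record.get("name", "").strip():
        problems.append("name is blank")
    if not EMAIL_PATTERN.match(record.get("email", "")):
        problems.append("email is malformed")
    return problems

def insert_customer(conn, record: dict) -> None:
    """Write a record to the source's customers table only if it passes the checks."""
    problems = validate_customer(record)
    if problems:
        raise ValueError("rejected at the source: " + ", ".join(problems))
    conn.execute(
        "INSERT INTO customers (customer_id, name, email) VALUES (?, ?, ?)",
        (record["customer_id"], record["name"].strip(), record["email"].strip().lower()),
    )
```

Because the checks run where the data originates, the source itself holds the quality data, and no second, cleaner copy has to live somewhere else.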

