Streaming technologies have been around for years, but as Felix Liao recently blogged, the number and variety of use cases that can take advantage of them have increased exponentially. I've blogged about why streaming is the most effective way to handle the volume, variety and velocity of big data: it delivers business insights faster than traditional store-it-first, analyze-it-later approaches typically can.
So, we can agree that streaming is beneficial for big data analytics. But could streaming also be beneficial for data quality?
I believe it could. As I see it, using analytics to perform a rapid data quality assessment is one of the biggest overlaps between analytics and data quality, second only to the fact that analytical models perform better with better data. Indeed, this approach to assessing data quality has become a necessity now that business users have so many data sources to consider, both within and outside of the enterprise.
Even when you employ a reusable set of data management processes to manage data where it lives so that data quality rules are consistently applied across all data sources, it doesn’t change the fact that some sources will have higher data quality levels than others.
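To make the idea concrete, here is a minimal sketch of what a streaming data quality assessment might look like: the same rules are applied to every record as it arrives, and a running pass rate is kept per source. The rule names, the field names (`customer_id`, `amount`) and the `assess_stream` function are all hypothetical, chosen only for illustration; a real deployment would sit on a streaming platform rather than a Python generator.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, Tuple

Record = Dict[str, object]
Rule = Callable[[Record], bool]

# Hypothetical data quality rules, applied identically to every source.
RULES: Dict[str, Rule] = {
    "has_customer_id": lambda r: bool(r.get("customer_id")),
    "amount_non_negative": lambda r: (
        isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0
    ),
}


def assess_stream(
    records: Iterable[Tuple[str, Record]],
) -> Iterator[Dict[str, float]]:
    """Yield the running pass rate per source after each incoming record."""
    seen: Dict[str, int] = defaultdict(int)
    passed: Dict[str, int] = defaultdict(int)
    for source, record in records:
        seen[source] += 1
        # A record passes only if it satisfies every rule.
        if all(rule(record) for rule in RULES.values()):
            passed[source] += 1
        # Snapshot of quality levels so far, without storing the records.
        yield {s: passed[s] / seen[s] for s in seen}


if __name__ == "__main__":
    events = [
        ("crm", {"customer_id": "c1", "amount": 10}),
        ("web", {"customer_id": "", "amount": 5}),
        ("web", {"customer_id": "c2", "amount": 3.5}),
        ("crm", {"customer_id": "c3", "amount": -1}),
    ]
    for snapshot in assess_stream(events):
        print(snapshot)
```

Because the assessment is updated record by record, the difference in quality between sources becomes visible while the data is still flowing, rather than after it has been stored and profiled.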