After last week’s post on the promise and perils of big data, I wanted to continue the discussion around data quality. Data quality is usually covered by “veracity and validity” as additional “Vs” of big data. In my experience, these two go hand-in-hand and speak to the issue at the heart of driving business value from big data. If users are not confident in the data quality, then it doesn’t matter what insights the system delivers, because no one will adopt it.
Merriam-Webster defines something as valid when it is “well-grounded or justifiable: being at once relevant and meaningful; logically correct.” Veracity is defined as “something true.” In the big data conversation (as found on insideBIGDATA), veracity refers to the biases, noise and abnormalities found in data, while validity refers to the accuracy and correctness of the data.