Data curation is the art of maintaining the value of data. A data curator does this by collecting data from many different sources and then aggregating and integrating it into an information source that is many times more valuable than its independent parts. During this process, data might be annotated, tagged, presented, and published for various purposes. The goal is to keep the data valuable so it can be reused in as many business applications as possible.
"Through the curation process, data are organized, described, cleaned, enhanced, and preserved for public use, much like the work done on paintings or rare books to make the works accessible to the public now and in the future," according to ICPSR, which provides data stewardship and rich data resources to the scientific and academic communities. "With the modern Web, it's increasingly easy to post and share data. Without curation, however, data can be difficult to find, use, and interpret."
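To make the collecting, integrating, and tagging steps described above more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of any particular curation tool: the `curate` function, the source names `crm` and `billing`, the `id` key, and the "first non-empty value wins" rule are all hypothetical choices made for the example.

```python
from collections import defaultdict


def curate(sources):
    """Merge records from several sources into one annotated record per id.

    Hypothetical rule: the first non-empty value seen for a field is kept,
    and the source that supplied it is recorded as that field's provenance.
    """
    merged = defaultdict(lambda: {"fields": {}, "provenance": {}, "tags": set()})
    for source_name, records in sources.items():
        for record in records:
            entry = merged[record["id"]]
            for field, value in record.items():
                # Skip the join key and empty values (a very basic cleaning step).
                if field == "id" or value in (None, ""):
                    continue
                if field not in entry["fields"]:
                    entry["fields"][field] = value
                    entry["provenance"][field] = source_name
            # Tag the curated record with every source that contributed to it.
            entry["tags"].add(source_name)
    return dict(merged)


# Two made-up sources describing the same customer.
sources = {
    "crm": [{"id": "C-1", "name": "Acme Corp", "region": ""}],
    "billing": [{"id": "C-1", "region": "EMEA", "balance": 1200}],
}
curated = curate(sources)
```

In this sketch the combined record for `C-1` carries the name from one system and the region and balance from the other, along with a note of where each value came from. That provenance annotation is what lets the integrated record be reused later with some confidence about its origins.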
Data curation is just now starting to enter corporate parlance because of big data and the need to aggregate many different types of data from diverse sources to form a unique picture of a business situation.
In the not-so-distant days when the only data stored and maintained came from transactional systems of record (SOR), IT performed rudimentary data curation through data retention and archiving. Decisions on which data to keep were driven largely by regulators and by how far back end-user departments felt data needed to be kept. Little effort went into assessing the inherent value that might be locked in the data, or how the data could be transformed into something larger and more useful.
In the last 24 months, these historical approaches to data retention and data value have started to shift for a couple of reasons.
In addition, organizations are thinking of compelling use cases in data curation, and how the inherent value of each data element can be enriched by uniquely combining it with other elements to yield a breakthrough business application. One of these applications involves mapping, document integration, and 3D simulations that attorneys are starting to use in courtrooms to demonstrate a point.