aaron-kalb

What Catalog Shopping Can Teach Us About Data

You know all those product catalogs sitting on your coffee table (or in your recycle bin) and the websites you visit to buy gifts? They hold interesting lessons for how information can be consumed in the big data era.

Like an inventory, a catalog should list everything available for consumption (and nothing that isn’t), but that’s not enough. An Amazon product page, for example, includes pictures, specs, reviews, and recommendations. These bits of information, cumulatively, help the user decide what to buy.

Consuming data also requires rich context. Before embarking on a research project, an analyst needs to understand the shape of the data set, its source, whether it is up to date, who else has used it, and how it was used. To address those requirements, a catalog should provide data samples and statistical profiles, lineage, lists of users and stewards, and tips on how the data should be interpreted.
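As a rough illustration of the context listed above, here is a minimal Python sketch of what one catalog entry might hold and how a small statistical profile could be computed from sample rows. The `CatalogEntry` class, its field names, and the `profile_rows` helper are hypothetical, invented for this example rather than taken from any particular catalog product.

```python
from dataclasses import dataclass, field
from statistics import mean
from typing import Any


@dataclass
class CatalogEntry:
    """Hypothetical record describing one data set in a catalog."""
    name: str
    source: str                      # lineage: where the data came from
    last_updated: str                # date of the latest refresh
    stewards: list[str] = field(default_factory=list)
    recent_users: list[str] = field(default_factory=list)
    notes: str = ""                  # tips on how to interpret the data
    sample: list[dict[str, Any]] = field(default_factory=list)
    profile: dict[str, dict[str, Any]] = field(default_factory=dict)


def profile_rows(rows: list[dict[str, Any]]) -> dict[str, dict[str, Any]]:
    """Compute a small per-column profile from a list of sample rows."""
    columns = sorted({key for row in rows for key in row})
    result: dict[str, dict[str, Any]] = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        stats: dict[str, Any] = {
            "count": len(values),
            "nulls": len(values) - len(present),
            "distinct": len(set(present)),
        }
        # Add min/max/mean only when every non-null value is numeric.
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        if numeric and len(numeric) == len(present):
            stats.update(min=min(numeric), max=max(numeric), mean=mean(numeric))
        result[col] = stats
    return result
```

An entry could then be filled in with something like `CatalogEntry(name="orders", source="warehouse export", last_updated="2017-03-01", sample=rows, profile=profile_rows(rows))`, where all of the values are placeholders.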

Yesterday’s data challenge was all about collecting relevant data for analysis and producing relevant reports. These days, many organizations possess the data and computational resources to answer almost any analytical question, but finding the most relevant, trustworthy data sets and metrics can be like finding a limited-edition Darth Vader Pez dispenser for Uncle Jack.

A 21st-century data catalog should do the following:

Some catalogs may try to be a source of truth about the right table to consult for a given purpose, the right categorization of a given value, or the right way to calculate a given metric. If universally consulted and respected, such prescriptive catalogs, hypothetically, could help everyone within an organization align and bring about an overall reduction in disparities and confusion. In practice, however, prescriptivism poses challenges for large enterprises (for example, when Hawaii is grouped with the other states by the finance department, but lumped in with Puerto Rico and Guam by the logistics team responsible for shipping).

A better approach is to document what people are doing: Who is querying which tables, viewing which reports, or using a particular calculation for a given metric? A data asset or technique used just one time by an intern probably isn’t trustworthy.
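One way to picture this descriptive approach is to summarize a query log by table. The sketch below assumes the log is available as simple `(user, table)` pairs. That log format, the function names, and the thresholds are all assumptions made for illustration, not a description of how any specific catalog works.

```python
from collections import defaultdict
from typing import Iterable


def usage_summary(query_log: Iterable[tuple[str, str]]) -> dict[str, dict[str, int]]:
    """Count queries and distinct users per table from (user, table) pairs."""
    queries: dict[str, int] = defaultdict(int)
    users: dict[str, set[str]] = defaultdict(set)
    for user, table in query_log:
        queries[table] += 1
        users[table].add(user)
    return {
        table: {"queries": queries[table], "distinct_users": len(users[table])}
        for table in queries
    }


def widely_used(
    summary: dict[str, dict[str, int]],
    min_users: int = 2,      # assumed threshold, tune per organization
    min_queries: int = 2,    # assumed threshold, tune per organization
) -> list[str]:
    """Return tables that several people have queried more than once."""
    return sorted(
        table
        for table, stats in summary.items()
        if stats["distinct_users"] >= min_users and stats["queries"] >= min_queries
    )
```

With a log such as `[("ana", "sales.orders"), ("ben", "sales.orders"), ("intern", "tmp.scratch")]`, the one-off `tmp.scratch` table would be left out of `widely_used`, which mirrors the point about assets touched only once. Usage counts are a signal of trust rather than proof of it, so a real catalog would likely combine them with steward endorsements and freshness.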


