What is Metadata and why is it as important as the data itself?


Metadata. You may have heard the term before and asked yourself either “what is metadata?” or “why is it as important as data?” This article attempts to answer both questions. As the topic can be quite dense, let’s jump right in!

Metadata can be explained in a few ways:

In short, metadata are important. I like to answer the “what is metadata” question this way: metadata are a shorthand representation of the data to which they refer. By analogy, we can think of metadata as references to data. Think about the last time you searched Google. That search started with the metadata you had in your mind about something you wanted to find. You may have begun with a word, phrase, meme, place name, slang term or something else. The possibilities for describing things seem endless. Metadata schemas can be simple or complex, but they all have some things in common.
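To make the search analogy concrete, here is a minimal Python sketch. It assumes a tiny, hypothetical catalog in which each metadata record holds a title, a few keywords, and a location that points to the data itself. The names `MetadataRecord`, `CATALOG`, and `search`, along with the example locations, are illustrative assumptions and not part of any real system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    """A shorthand description of a dataset, plus a pointer to it."""
    title: str
    keywords: frozenset
    location: str  # hypothetical reference to where the data lives


# Hypothetical catalog: the records describe the data; they are not the data.
CATALOG = [
    MetadataRecord("Site survey photos", frozenset({"archaeology", "photos"}), "archive/box-12"),
    MetadataRecord("City bus ridership", frozenset({"transit", "open data"}), "portal/ridership.csv"),
]


def search(term: str) -> list:
    """Return the locations of datasets whose title or keywords match the term."""
    term = term.lower()
    return [
        record.location
        for record in CATALOG
        if term in record.title.lower() or term in record.keywords
    ]


print(search("archaeology"))
```

The search never touches the data itself. It works entirely on the descriptions and hands back a reference, which is the sense in which metadata act as references to data.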

I am not that old, but I am old enough to remember doing my job without digital aids. In the early 90s, I was a (then) young archaeologist working for Battelle Pacific Northwest Laboratory on the Hanford Project. Hanford is the site where the United States produced weapons-grade plutonium, including the plutonium used in the bomb dropped on Nagasaki in 1945. Enrico Fermi had a lab there and the US Department of Energy saw this facility as having historical significance. There is a point to this anecdote. In 1992 and 1993, we had basic TCP/IP, but we did not have the array of digital tools we have today.

Provenance was the word used back then to describe the origins and the nature of objects. If I unearth an artifact and I take it out of its context, that is, I remove it from the site, what would happen to its scientific value? That depends on how well I describe its provenance and whether I use the keywords and organizing principles used to categorize, describe, analyze and curate similar objects and artifacts. This is why looting of archaeological sites is so damaging. Not only is the object lost, but even if it is recovered, it has lost its provenance, and with it its meaning!

This anecdote hopefully starts to convey the idea that data about the data is as important as the data itself. Without context, data has little reuse value.

In the context of my job as an archaeologist, an object loses its scientific value if it loses its provenance, or metadata. Every artifact is bagged and tagged using a numerical reference on the bag that corresponds to notes in a log. Often there are photos and sketches made of the artifact in situ (in its original place) for future research. Archaeology is not about treasure hunting. Open Data is not just about storytelling. Both endeavors are fun and exciting. But the useful side of both Open Data and Archaeology is the amount of reuse we can derive from our objects, whether they be stones and bones or massive datasets.
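The bag-and-tag practice is, in effect, a lookup from an identifier to a description. A minimal Python sketch of that idea follows. The tag numbers, field names, and log contents are invented for illustration, and `FIELD_LOG`, `provenance`, and `has_reuse_value` are hypothetical names, not a real cataloging system.

```python
# Hypothetical field log: tag number on the bag -> notes recorded at the site.
FIELD_LOG = {
    101: {"site": "Trench A", "depth_cm": 40, "notes": "photographed in situ"},
    102: {"site": "Trench B", "depth_cm": 15, "notes": "sketched in situ"},
}


def provenance(tag_number: int):
    """Look up the log entry for a bagged artifact, or None if it has none."""
    return FIELD_LOG.get(tag_number)


def has_reuse_value(tag_number: int) -> bool:
    """In this sketch, an artifact is reusable only if its provenance is on record."""
    return provenance(tag_number) is not None


print(has_reuse_value(101))  # artifact with a log entry
print(has_reuse_value(999))  # artifact with no log entry, e.g. a looted object
```

The object in the bag is the same either way. What changes is whether anyone can still find out where it came from, and the same holds for a dataset that is separated from its metadata.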

Now that we have a basic answer to our original question “what is metadata”, let’s take a look at what others have had to say. I use two definitions as a reference: one from the International Organization for Standardization (ISO), the other from White House Roundtables that I attended (both on Data Quality and on Open Data for Public-Private Collaboration), where we co-constructed a definition in the presence of experts.

The ISO and White House Roundtables definitions of data quality have some subtle differences. First, provenance in the White House context is defined as the metadata of a dataset. Second, there is no “timeliness” dimension in the ISO definition of Data Quality. The ISO definition predates the widespread adoption of Open Data. Perhaps timeliness will become a part of the ISO definition in the future. The ISO provides a semantic definition of Data Quality, which serves as the metadata requirement.

 
