The arrival of big data means many things. For one, it has exposed the limits of data stalwarts such as structured query language (SQL) and relational databases. Sure, an organization's internal and structured data still matters – but it’s not the only game in town these days. Indeed, the vast majority of what we call big data is unstructured and lies outside enterprise walls. Grappling with issues around big data has persuaded many organizations to ink in a new box on the org chart – one for the chief data officer (CDO) role.
Still think big data does not dramatically affect data governance and even basic management on many levels? Think again. Let’s take a closer look at what big data – specifically, the chief data officer role – means for traditional data governance.
By way of background, here’s a simple definition of a CDO. In my view, a CDO is an executive formally charged with overseeing how the enterprise uses and protects both internal and external information. To this end, the person in the chief data officer role is responsible for enterprisewide data governance. (As I'll explain later, the role may very well overlap with other executive positions.)
Before continuing, a disclaimer is in order. As with big data, there's anything but universal agreement on the definition and responsibilities of a proper CDO. Because it's such a nascent and evolving role, it's much murkier than a garden-variety CFO or CEO. What's more, many prominent organizations choose not to employ one. Examples include Amazon, Netflix, Facebook and Google.
Of course, it's folly to think of all organizations as the same. The business and data challenges facing a midsized American company differ dramatically from those faced by large health care organizations and behemoths like Google. Oh, and legislation differs greatly based on industry and geography. To expect all organizations to manage or govern their data identically or even similarly is just plain silly.
With respect to traditional data governance, my experience shows there are generally three types of organizations:
Let's assume for the sake of argument that Type A and B organizations are doing data governance well.