Today’s CIOs are no strangers to the concept of the enterprise data lake. An enterprise data lake is often viewed as a panacea for all of a CIO’s data ills, and even as the ‘holy grail’ for those trying to spur digital transformation. Yet many CIOs are still struggling to see the payoffs from such data lake investments.
Well-known Big Data commentator Bernard Marr observed that this is likely the result of disconnected goals and a lack of communication between the professionals working directly with enterprise data and those responsible for business performance. In fact, according to a recent study from The Economist, although seventy percent of business executives rated sales and marketing analytics as ‘very’ or ‘extremely important’, only two percent say they have achieved ‘broad, positive impact.’ A similar study by research firm Gartner found that the maturity with which businesses use Business Intelligence is still low, with only five percent of respondents indicating they have the ability to take full advantage of advanced analytics.
While both studies reveal the gap between data analysts and decision makers in analytics initiatives, what they overlook is the role data lakes can play in closing that gap. Ultimately, CIOs need to take a more strategic approach to enterprise data lakes. Here are three questions CIOs should ask themselves in order to reap the full benefits of their data lakes.
It seems almost instinctive for CIOs to start any data lake initiative by considering to what extent their data lake will be used and building a strategy accordingly. Who will provide the data? Which departments and which data sets are involved? Who will consume the data? These are important questions to ask when CIOs are determining their data lake strategy.
Choosing a data lake technology, e.g. Hadoop or Amazon S3, is only step one of the journey and does not guarantee success. To achieve a true ‘victory’, a CIO’s strategy needs to consider not only the technology used but, more importantly, the people and processes required to make the most effective use of the data lake. Hence, CIOs need to find the right tools (Hadoop or other related Big Data technologies); train their teams with the right skill sets and implement processes to facilitate effective use of the information; and enable access without losing control of data governance or compliance.
Oftentimes, ‘data is simply dumped into the data lake’ without prior planning of how it’s going to be used, and only a handful of highly skilled data scientists who have mastered advanced techniques such as predictive analytics or machine learning can make use of it. If a data lake is built without properly enabling people or establishing processes, CIOs will end up facing an array of disparate systems and solutions that, in the end, can’t connect or scale. That makes it impossible to keep up with escalating business needs and data demands, and it recreates yet another accidental architecture.