One of my favorite examples of why so many big data projects fail comes from a book written decades before “big data” was even conceived. In Douglas Adams’ The Hitchhiker’s Guide to the Galaxy, a race of creatures builds a supercomputer to calculate the answer to the ultimate question of “life, the universe, and everything.” After seven and a half million years of processing, the computer announces that the answer is “42.” When the beings protest, the computer calmly points out that now they have the answer, they need to work out what the actual question is, a task that requires a much bigger and more sophisticated computer.
This is a wonderful parable for big data because it illustrates one quintessential fact: data on its own is meaningless. The value of data is not the data itself; it’s what you do with the data. For data to be useful, you first need to know what data you need. Otherwise you are tempted to try to know everything, and that’s not a strategy; it’s an act of desperation that is doomed to end in failure. Why go to all the time and trouble of collecting data that you won’t or can’t use to deliver business insights? You must focus on the things that matter most, otherwise you’ll drown in data. Data is a strategic asset, but it’s only valuable if it’s used constructively and appropriately to deliver results.
This is why it’s so important to start with the right questions. If you are clear about what you are trying to achieve, you can think about the questions you need answered. For example, if your strategy is to increase your customer base, those questions might include ‘Who are our current customers?’, ‘What are the demographics of our most valuable customers?’ and ‘What is the lifetime value of our customers?’ Once you know the questions, it’s much easier to identify the data you need in order to answer them. For example, I worked with a small fashion retail company that had no data other than their traditional sales data.