The democratization of analytics means that more and more people are participating, which is good. But, if you manage people who deliver reports and data analysis to your desk, it should also give you pause. To drive good decisions, you'll have to ask some hard questions and ultimately educate your staff about the many ways it is possible to lie with statistics.
Here are three ways data-driven decisions can go awry.
Use bad data, old data, or not enough data
Getting the data right is the most important aspect of getting the right answers. You want to make sure that the data used is clean and fresh, and that there is enough of it.
Bad data leads to unreliable conclusions. Data analysts have to be trained to look for outliers, missing values, and duplicates, all of which skew results in different ways.
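To make those three checks concrete, here is a minimal Python sketch of what you might ask an analyst to run before anything reaches your desk. Everything in it is illustrative: the `quality_report` function, the `order_id` and `amount` field names, and the sample orders are hypothetical. The 1.5 × interquartile-range rule is just one common convention for flagging outliers, not the only reasonable one.

```python
from collections import Counter
from statistics import quantiles


def quality_report(records, id_key="order_id", value_key="amount"):
    """Flag duplicate IDs, missing values, and outliers in a list of dict records."""
    # Duplicates: any ID that appears on more than one record.
    id_counts = Counter(r[id_key] for r in records)
    duplicate_ids = sorted(i for i, count in id_counts.items() if count > 1)

    # Missing values: records where the value field is absent or None.
    missing_ids = [r[id_key] for r in records if r.get(value_key) is None]

    # Outliers: values more than 1.5 interquartile ranges outside the quartiles.
    values = [r[value_key] for r in records if r.get(value_key) is not None]
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    outlier_ids = [
        r[id_key]
        for r in records
        if r.get(value_key) is not None and not (low <= r[value_key] <= high)
    ]

    return {
        "duplicate_ids": duplicate_ids,
        "missing_ids": missing_ids,
        "outlier_ids": outlier_ids,
    }


# Hypothetical sample: one missing amount, one repeated order, one suspicious amount.
sample_orders = [
    {"order_id": 1, "amount": 20.0},
    {"order_id": 2, "amount": 22.5},
    {"order_id": 3, "amount": None},
    {"order_id": 4, "amount": 21.0},
    {"order_id": 4, "amount": 21.0},
    {"order_id": 5, "amount": 19.5},
    {"order_id": 6, "amount": 23.0},
    {"order_id": 7, "amount": 950.0},
]

print(quality_report(sample_orders))
```

A report like this does not tell you what to do with a flagged record. It only tells you which ones deserve a question.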
Freshness matters too. If you base a joint product offer on a partner's pricing data and that data is a month old, who's going to eat the difference in costs? Given the proliferation of data, selecting the wrong file version is easy to do.
Having enough data is also critical. Imagine a customer satisfaction survey about umbrellas that randomly selects 200 people from Arizona and Texas. Oops. There is no such thing as a 200-person trend! What if the same analysis included two million people by adding New England, Canada, and the United Kingdom? For decades, we used 5% data samples because the software and the computers couldn't handle the whole dataset. That isn't true anymore. Big data washes out anomalies, providing more accuracy. Mo data is mo betta.
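One rough way to see what sample size buys you is the textbook margin of error for a survey percentage. The sketch below assumes a simple random sample, a 95% confidence level (z = 1.96), and the worst-case 50/50 split. The function name `margin_of_error` is made up for illustration. Note that this covers random sampling error only. It says nothing about whether you surveyed the right places, which is the other half of the umbrella problem.

```python
import math


def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a survey proportion (simple random sample)."""
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)


# Compare the 200-person survey with the two-million-person one.
for n in (200, 2_000_000):
    print(f"n={n:>9,}: +/- {margin_of_error(n):.2%}")
```

Under these assumptions, the 200-person survey carries an uncertainty of roughly seven percentage points either way, while the two-million-person version is well under a tenth of a point.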
Don’t question the methods
Sometimes users apply analytics that they don't really understand or haven't fully vetted. Just because you downloaded an algorithm doesn't mean you have a good one. There are tradeoffs built into those algorithms that take scrutiny to discover. Universities can turn out new algorithms every 15 minutes. But students don't have to stand in front of an executive and be grilled about their results. When I say question analysts, I mean everyone, data scientists included. Management must depend on their smartest people. Still, if we ask, a data scientist may eventually admit, "I didn't have time to analyze enough data and could only run the A/B analysis once." Asking questions helps you know what you're dealing with. Try asking, "If you had 10 more hours to work on this, how much more confident would you be in your analysis?"
Use spreadsheets prolifically
Everyone uses spreadsheets, and yet they are riddled with errors: up to 88% of them contain at least one. A person uses a formula one day, and then three days later copies and pastes something that overwrites that formula. Voilà, an error is born. If you're using a spreadsheet to look at data, keep data and formulas separated. There are lots of interesting stories about this type of mistake, including a $6 billion loss at JPMorgan Chase.
Let's face it; there are lots of ways to make bad decisions. Making good data-driven decisions requires constant vigilance, some uncomfortable questioning, and a dose of ongoing education about how to use data in the right way. If your analysis is poor, your company will be poor. You'd rather not have to explain interesting stories about bad decisions at your next board meeting.