As analytics maturity progresses alongside advances in modern business intelligence, we are seeing more innovative players in the area of analytics automation. From automating visualizations and descriptive and predictive models to creating intelligent natural language summaries of analytical findings, analytics automation has arrived along with the era of citizen data science. In this article, I will showcase a few analytics automation technologies and discuss which aspects of these offerings should be embraced.
Automating analytics is not a new concept. I have written articles before on SAP Predictive Analytics (KXEN), Tableau, TIBCO Spotfire, Microsoft Power BI and other vendors providing wonderful suggested data visualizations, outlier detection, forecasting, clustering and intelligent predictive analytics capabilities in visual analytics tools. These applications do not replace an analyst. They aid an analyst.
If an analyst or business user feeds one of these tools poor-quality data, the predictive results will be poor. Think garbage in, garbage out. I believe there is still an art to selecting and preparing data elements for these tools so that they accurately reflect a business process. Automated analytics can work through millions of variable combinations that would be unreasonable for a human to evaluate, but a human might not understand the results, and a machine might not be able to decipher nuances in business context. The true beauty of automating analytics is seen when combining the human mind with the power of intelligent machine learning. Best-in-class automated analytics provides machine learning results in human-friendly, natural language along with guided recommendations.
Analytics and data science communities have been discussing the strengths and weaknesses of analytics automation for a while now. The cries for data control, security, quality and statistical domain knowledge versus quick business decision empowerment reflect many of the same battle cries heard at the beginning of the self-service BI revolution. Don’t waste your energy trying to block analytics automation. Be a hero and help less data-savvy business users understand how to use these tools properly.
Historically, advanced analytics and data mining practitioners have had tools that automate steps in the overall machine learning process. For example, during the data understanding step of the Cross Industry Standard Process for Data Mining (CRISP-DM), a data scientist might run routines to rapidly identify the attributes with the most information gain. The example below is from the classic open source Weka Explorer for data mining, using the Microsoft Adventure Works sample Bike Buyers data.
On the data load and preprocess screen, data quality and skewness of manually selected fields can be reviewed. On the select attributes screen, information gain and ranking routines can be run. Weka provides a vast library of predictive algorithms and granular control of model settings. The individual steps are not hard, but there is a steep learning curve in data preparation and technique for predictive modeling. The CRISP-DM process can be quite time-consuming. As a result, predictive modeling projects are often time-boxed.
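To make the idea of ranking attributes by information gain more concrete, here is a minimal Python sketch. It is not Weka's implementation, and it does not use the Adventure Works data. The records, the field names (`commute`, `owns_car`, `bike_buyer`) and the helper functions are all made up for illustration, and the sketch assumes every attribute is categorical.

```python
from collections import Counter, defaultdict
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, attribute, target):
    """Drop in target entropy after splitting the rows on one attribute."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[attribute]].append(row[target])
    # Weighted average entropy of the target within each attribute value.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[target] for row in rows]) - remainder


def rank_attributes(rows, attributes, target):
    """Return (gain, attribute) pairs, highest information gain first."""
    scored = [(information_gain(rows, a, target), a) for a in attributes]
    return sorted(scored, reverse=True)


# Hypothetical toy records, invented for this sketch only.
rows = [
    {"commute": "short", "owns_car": "no", "bike_buyer": "yes"},
    {"commute": "short", "owns_car": "yes", "bike_buyer": "yes"},
    {"commute": "long", "owns_car": "no", "bike_buyer": "no"},
    {"commute": "long", "owns_car": "yes", "bike_buyer": "no"},
]

ranking = rank_attributes(rows, ["commute", "owns_car"], "bike_buyer")
for gain, name in ranking:
    print(f"{name}: {gain:.3f} bits")
```

In this toy data, `commute` separates buyers from non-buyers perfectly, so it ranks first, while `owns_car` tells us nothing about the outcome. Real data would also need numeric fields to be binned before a calculation like this applies, which is part of the data prep learning curve mentioned above.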
After working through all of the CRISP-DM steps and finding a reliable predictive model, the algorithm output might get embedded into a business application, report or dashboard.