Bringing DevOps to Analytics and Data Science
- by 7wData
There's no shortage of material about DevOps on the Interwebs. But as you sift through the mountains of information, most of it appears skewed toward better and faster web and mobile app development. Look for details on how the thinking can be applied to the wicked world of data science or analytics and you'll be hard-pressed to find much at all.
This seems odd, because a sustainable business isn't just carved out on the back of web and mobile apps. It's also dependent on intimately understanding customer data and pivoting towards the opportunities revealed - with yes, web and mobile apps.
It's not rocket science; it's data science, and web-scale companies understand it's fundamental to mega-efficient business operations. It explains why Netflix has gone from stuffing DVDs through the mailbox to streaming its own shows, and how Amazon continues to redefine retail.
For most businesses the reality is quite different. Even though many consider themselves data-centric, they often struggle to get value from their data science initiatives. In part this might be explained by the ad-hoc nature of efforts, with new models only developed to support one-off projects. It might also be due to the massive amounts of effort needed to maintain and update these types of systems if they ever get deployed into the wild.
Blaming the tech, data or propeller heads is wrong. Today many businesses have mountains of data along with the scientists skilled in doing the math and building the analytical models.
They can even back this up with coding skills and have Big Data engineering teams laying down Kafka, Apache Spark and Hadoop-like technologies to deliver the number crunching grunt.
Yet problems persist, with many businesses questioning the value of fragile analytics systems that need to be rebuilt every time a model changes. Worse still, any analysis that does get performed might deliver inconsistent results. If this leads to bad decision making, then data science can quickly become the organizational equivalent of snake oil.
The real reason behind these problems isn't tech or teams; it's the isolation of the data science and analytics function from IT development and engineering. Working alone, data science teams can find the right signal in all the data (even codify it), but lacking consistent access to production-like systems and test data means any results only get verified when a product goes live. When this eventually happens, these teams aren't set up to maintain or support the resulting systems, and can encounter stiff resistance from IT operations when those systems are shown to be unreliable and fragile.
Faced with IT push-back, business units can of course be tempted to go it alone. This might be fine for a small one-off analytics project, but the cost and risk are prohibitive for the business as a whole if the result is application and code base disparity, overlapping or polluted datasets and complex infrastructure sprawl.
Worse than any of the technology problems is the lost business opportunity.