Driving Data Science Productivity Without Compromising Quality

How will data science teams maintain quality standards in the face of advancing automation? Attend the IBM DataFirst Launch Event on Sep 27 in NYC and learn how to drive greater productivity from your data science teams without compromising the quality of the mission-critical business assets they produce.

Productivity in data science isn’t a matter of output in any quantitative sense. It’s more an issue of the quality of what data scientists produce.

In a data-science context, quality refers to the validity and relevance of the insights that statistical models are able to distill from the data. As I stated recently, the more data you have, the more stories data scientists can tell with it, though many of those narratives may be entirely (albeit inadvertently) fictitious.

Given the paramount importance of actionable insights, the productivity of data science teams can’t be neatly reduced to throughput or any other quantitative metric. Data scientists can easily pump up their aggregate output along myriad dimensions, such as more sources, more data, more pipeline processes, more variables, more iterations, and more visualizations. But that doesn’t necessarily get them any closer to delivering high-quality analytics for predictive, prescriptive, and other uses.

Likewise, you can’t always assume that throwing more data scientists at a problem will boost the quality of their collective output. Different data scientists may rely on different data sources; aggregate, cleanse, and sample them in different ways; incorporate different feature sets into their models; use different algorithmic and visualization approaches; employ different metrics of model fitness and predictive capability; and so on. Without standardized approaches and consistent documentation, third parties may struggle to determine whose results, if anybody’s, are most valid when several data scientists work on a common problem.

As the number and variety of data scientists at work on a problem grow, they may surface more spurious correlations than causal insights. And if they use different methodologies to produce their various results, it may become more difficult to identify who, if anyone, has delivered the highest-quality insights.
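To make the point about spurious correlations concrete, here is a minimal sketch in Python (not from the original post). It assumes pure Gaussian noise with no real relationships at all, and the names `make_noise_columns` and `count_spurious_pairs`, the 30-row sample size, and the 0.4 threshold are all illustrative choices. Because the number of variable pairs grows quadratically with the number of variables, the count of pairs that clear the threshold by chance can only stay level or rise as more variables are explored.

```python
import math
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def count_spurious_pairs(columns, threshold=0.4):
    """Count column pairs whose absolute correlation exceeds the threshold."""
    return sum(
        1
        for a, b in combinations(columns, 2)
        if abs(pearson(a, b)) > threshold
    )


def make_noise_columns(n_columns, n_rows, seed=0):
    """Build columns of independent Gaussian noise (no true relationships)."""
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, 1.0) for _ in range(n_rows)]
        for _ in range(n_columns)
    ]


if __name__ == "__main__":
    # Illustrative sizes: 80 unrelated variables, 30 observations each.
    columns = make_noise_columns(n_columns=80, n_rows=30)
    for k in (10, 20, 40, 80):
        pairs = k * (k - 1) // 2
        hits = count_spurious_pairs(columns[:k])
        print(f"{k:3d} variables, {pairs:5d} pairs, {hits:4d} with |r| > 0.4")
```

Every "hit" this script reports is fictitious by construction, since the columns are independent. In practice, a team would pair this kind of exploration with safeguards such as holdout validation or multiple-comparison corrections before treating any correlation as an insight.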

If you throw “citizen data scientists” into the mix, the quality problem can easily worsen unless you implement safeguard procedures (to be discussed later in this post). “Citizen data scientist” refers to the new generation of statistical explorers who lack the traditional academic and work backgrounds of established data scientists. Typically, these newbies are self-taught, self-starting, and self-sufficient, and they use self-service, cloud-based statistical modeling and data engineering tools. They tend to use idiosyncratic methods, which may frustrate subsequent efforts by others to assess the validity of their findings.
