How Can Lean Six Sigma Help Machine Learning?


Data cleansing alone is not sufficient to ensure the accuracy of a machine learning model when noise or bias exists in the input data. Lean Six Sigma variance reduction can improve the accuracy of machine learning results.

By Joseph Chen, Senior Manager and Architect in BI, Data Warehousing, Six Sigma, and Operations Research.

I have been using Lean Six Sigma (LSS) to improve business processes for the past 10+ years and am very satisfied with its benefits. Recently, I have been working with a consulting firm and a software vendor to implement a machine learning (ML) model to predict the remaining useful life (RUL) of service parts. What frustrates me most is the low accuracy of the resulting model. If we measure the deviation as the absolute difference between the actual part life and the predicted one, the model averages 127, 60, and 36 days of deviation for the three selected parts. I could not understand why the deviations are so large with machine learning.
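
To make that deviation metric concrete, here is a minimal sketch; the part-life numbers are invented for illustration and are not the project's data.

```python
import numpy as np

# Hypothetical actual vs. predicted remaining useful life (in days) for one part type.
# These values are illustrative only.
actual_life = np.array([410, 395, 520, 300, 480])
predicted_life = np.array([290, 450, 400, 410, 360])

# Deviation measured as the absolute difference between actual and predicted life,
# averaged over all parts (mean absolute deviation).
mean_abs_deviation = np.mean(np.abs(actual_life - predicted_life))
print(f"Average deviation: {mean_abs_deviation:.0f} days")
```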

After working with the consultants and data scientists, it appears they can reduce the deviation by only about 10% through data cleansing. This puzzles me. Even after the 10% improvement, such a deviation still renders the forecast useless to the business owners, and it forced me to ask whether Lean Six Sigma could help.

The objective of Lean Six Sigma (LSS) is to improve process performance by reducing its variance, where the variance is defined as the sum of squared errors (differences) between the actual and forecast measures. The result of an LSS study is essentially a statistical function (model) between a set of input/independent variables and the output/dependent variable(s), as shown in the chart below.
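
As a rough sketch of that idea, assuming a simple linear relationship and made-up data, the "model" an LSS study produces can be thought of as a fitted function from the inputs to the output, with the process variance captured by the sum of squared errors around it.

```python
import numpy as np

# Illustrative only: two input variables driving one output variable.
rng = np.random.default_rng(0)
x1 = rng.normal(50, 5, size=200)   # e.g., operating temperature
x2 = rng.normal(10, 2, size=200)   # e.g., load cycles per day
y = 3.0 * x1 - 8.0 * x2 + rng.normal(0, 4, size=200)  # output measure

# Fit a simple linear model (ordinary least squares) y = f(x1, x2).
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ coef

# Variance of the process around the model: the sum of squared errors.
sse = np.sum((y - y_hat) ** 2)
print("fitted coefficients:", coef)
print("sum of squared errors:", round(sse, 1))
```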

By identifying the correlations between the input and output variables, the LSS model tells us how to control the input variables in order to move the output variable(s) to our target values. Most importantly, LSS also requires the monitored process to be “stable”, i.e., it minimizes the variance of the output variable by minimizing the variance of the input variables, in order to achieve the so-called “breakthrough” state.
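
A small illustration of why that works, assuming a linear input-output relationship and invented numbers: the output variance scales with the input variance, so tightening control of the inputs tightens the output.

```python
import numpy as np

# Assume an output y that depends linearly on an input x with slope b,
# plus irreducible noise. Then Var(y) grows with b^2 * Var(x), so reducing
# the input variance directly reduces the output variance.
rng = np.random.default_rng(1)
b = 3.0

for input_sd in (5.0, 2.0, 1.0):                  # progressively tighter input control
    x = rng.normal(50, input_sd, size=10_000)
    y = b * x + rng.normal(0, 4, size=10_000)     # same irreducible noise each time
    print(f"input sd {input_sd:>4}: output sd {y.std():.1f}")
```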

As the chart below shows, if you hit your target (the center) without controlling the variance (the spread around the target in the left chart), there is no guarantee you will stay on target; if you reduce the variance without reaching the target (right chart), you miss the target. Only by keeping the variance small and centered on the target can LSS ensure the target is reached precisely, with sustainable and optimal process performance. This is the major contribution of LSS.
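
One way to see why both pieces matter, using made-up samples rather than anything from the article: the mean squared error around a target splits into a bias term (how far off-center you are) and a variance term (how spread out you are), and only the centered, low-variance case drives both down.

```python
import numpy as np

# Illustrative only: MSE around a target = bias^2 + variance.
rng = np.random.default_rng(2)
target = 100.0

cases = {
    "centered, high variance": rng.normal(100.0, 15.0, size=10_000),  # left chart
    "off-center, low variance": rng.normal(120.0, 3.0, size=10_000),  # right chart
    "centered, low variance": rng.normal(100.0, 3.0, size=10_000),    # LSS goal
}

for name, sample in cases.items():
    bias = sample.mean() - target
    mse = np.mean((sample - target) ** 2)
    print(f"{name:>25}: bias {bias:6.1f}, variance {sample.var():7.1f}, MSE {mse:7.1f}")
```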

 


