Hacking the Data Science Radar with Data Science

04 Jul 16
This post was first published by original author Duncan Garmonsway and reproduced with his kind permission.
This post reverse-engineers the Mango Solutions Data Science Radar using
Programming (R)
Why hack? Because getting at the innards also reveals
What a good score is in each category
Which statements are most important
Whether scores are comparable across people
Whether you should strongly agree with the statement “On average, I spend at least 25% of my time manipulating data into analysis-ready formats”
The radar
Based on Likert-style responses to 24 provocative statements, the Data Science Radar visualises your skills along six axes, the “core attributes of a contemporary ‘Data Scientist’.” It looks like this.
[Figure: Mango Solutions Data Science Radar]
First attempt: Multivariate multiple regression
How can we score better? Hacking the URL would be cheating, so instead, let’s use science: hypothesise -> test -> improve. Here are some initial guesses.
Each of the 24 statements relates to exactly one attribute, i.e. four statements per attribute.
The Likert values (strongly agree, agree, somewhat agree, etc.) are coded from 1 to 7 (since there are seven points on each axis).
There is a linear relationship between the coded agreement with the statements and the attribute scores.
So something like
$$\text{score}_{\text{attribute}} = \frac{1}{4} \sum_{i = 1}^{4} \text{answer}_i$$
where $\text{answer}_i = 1, 2, \dots, 7$ by encoding “Strongly disagree” as 1, up to “Strongly agree” as 7, including only the four relevant answers per attribute. The best-possible set of answers would score 7 on every axis, and the worst set would score 1.
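The hypothesised scoring rule can be sketched in a few lines of R. This is only an illustration of the guess above, not the Radar's actual scoring: the function name `hypothesised_scores` is made up, and the mapping of statements 1 to 4 onto the first attribute, 5 to 8 onto the second, and so on, is an assumption (the real grouping is unknown at this point).

```r
# Attribute names as they appear on the radar.
attributes <- c("Communicator", "Data Wrangler", "Modeller",
                "Programmer", "Technologist", "Visualiser")

# ASSUMPTION: statements are grouped in consecutive blocks of four,
# one block per attribute. The real questionnaire may group them differently.
hypothesised_scores <- function(answers) {
  stopifnot(length(answers) == 24, all(answers %in% 1:7))
  # Group label for each of the 24 answers: 1,1,1,1,2,2,2,2,...
  group <- rep(seq_along(attributes), each = 4)
  # Mean of the four coded answers in each block.
  scores <- tapply(answers, group, mean)
  setNames(as.numeric(scores), attributes)
}

best  <- hypothesised_scores(rep(7, 24))  # "Strongly agree" throughout
worst <- hypothesised_scores(rep(1, 24))  # "Strongly disagree" throughout
mixed <- hypothesised_scores(c(7, 5, 6, 2, rep(4, 20)))
```

Under these assumptions, `best` is 7 on every axis, `worst` is 1 on every axis, and `mixed` scores (7 + 5 + 6 + 2) / 4 = 5 on the first axis and 4 on the rest.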
If the hypotheses are correct, then all we need to do to prove them is to record 24 sets of random answers, the resulting scores, and fit a multivariate linear model. We’d expect each score (outcome variable) to have four non-zero coefficients (out of the 24 input variables). Let’s try it.
    library(readr)  # provides read_csv()

    # The first two aren't random, but they're still linearly independent of the
    # others, which is what matters.
    random_data <- read_csv("./data/radar-random.csv")

    # Multivariate regression: six attribute scores on all 24 answers at once.
    lm1 <- lm(cbind(Communicator, `Data Wrangler`, Modeller,
                    Programmer, Technologist, Visualiser) ~ .,
              data = random_data)
    lm1

    ##
    ## Call:
    ## lm(formula = cbind(Communicator, `Data Wrangler`, Modeller, Programmer,
    ##     Technologist, Visualiser) ~ ., data = random_data)
    ##
    ## Coefficients:
    ##              Communicator  Data Wrangler  Modeller    Programmer  Technologist  Visualiser
    ## (Intercept)   2.060e+00     2.422e+00      3.247e+00   6.658e-01  -1.331e+00     1.456e+00
    ## q01           1.997e-01    -2.507e-02      2.602e-01  -1.103e-01  -5.866e-02    -7.103e-02
    ## q02          -2.571e-01     2.729e-02     -4.514e-01   2.090e-01   1.554e-01     1.281e-01
    ## q03           3.087e-01     1.744e-02     -3.471e-01  -1.303e-03   5.611e-02     1.978e-01
    ## q04           4.356e-01     8.534e-04     -8.676e-03  -2.346e-02  -7.130e-02    -4.193e-02
    ## q05          -2.524e-01     2.267e-01      8.732e-01  -1.559e-01  -1.907e-01    -3.885e-01
    ## q06          -1.948e-01     1.545e-01      7.016e-01  -7.626e-02  -1.271e-01    -3.897e-01
    ## q07          -7.925e-03     2.075e-01      4.423e-01  -1.089e-01  -2.015e-01    -2.247e-01
    ## q08           8.902e-02    -4.810e-01     -1.246e-02   8.111e-02  -5.556e-02    -4.572e-02
    ## q09           1.901e-01     5.174e-02     -5.260e-01  -9.428e-02   5.506e-02     2.620e-01
    ## q10           9.750e-02    -1.248e-02     -2.365e-01   3.181e-02   1.557e-01     3.267e-01
    ## q11          -2.099e-01    -5.220e-02      2.943e-01   2.032e-01   6.801e-02    -1.775e-01
    ## q12          -1.000e-01     1.813e-15      7.000e-01  -1.333e-01   9.653e-16    -1.000e-01
    ## q13           5.164e-02     2.647e-02     -3.386e-01   2.881e-01  -4.010e-03     1.428e-01
    ## q14           1.211e-01    -8.162e-02     -3.835e-02  -2.508e-01  -4.963e-02     7.972e-02
    ## q15           4.971e-03     5.
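For comparison, here is a minimal sketch of what such a fit would return if all three hypotheses held exactly. The data are simulated, not real Radar responses. The consecutive block mapping (q01 to q04 feed the first attribute, and so on), the seed, and the choice of 40 answer sets are all assumptions made purely for illustration.

```r
set.seed(1)  # arbitrary seed, for reproducibility only

# ASSUMPTION: 40 simulated answer sets, comfortably more than the
# 25 coefficients per outcome (24 answers plus an intercept).
n_sets <- 40
answers <- matrix(sample(1:7, n_sets * 24, replace = TRUE), ncol = 24)
colnames(answers) <- sprintf("q%02d", 1:24)

attributes <- c("Communicator", "Data Wrangler", "Modeller",
                "Programmer", "Technologist", "Visualiser")

# ASSUMPTION: hypothetical block mapping. Column j of `weights` holds 0.25
# in rows 4(j - 1) + 1 to 4j and zero elsewhere (a 24 x 6 matrix).
weights <- kronecker(diag(6), matrix(0.25, nrow = 4, ncol = 1))
scores <- answers %*% weights
colnames(scores) <- attributes

# check.names = FALSE keeps the space in "Data Wrangler".
simulated <- data.frame(answers, scores, check.names = FALSE)

fit <- lm(cbind(Communicator, `Data Wrangler`, Modeller,
                Programmer, Technologist, Visualiser) ~ .,
          data = simulated)

# Round away floating-point noise before inspecting the coefficients.
round(coef(fit), 3)
```

If the hypotheses were right, each outcome column would show exactly four coefficients of 0.25, zeros elsewhere, and a zero intercept. That pattern is the benchmark to hold the real coefficient table against.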
