The role of data and analytics in business continues to grow. To make sense of the plethora of data they collect, businesses are looking to data scientists for help. The job site indeed.com shows continued growth in “data scientist” positions. To better understand the field of data science, we studied hundreds of data professionals.
In that study, we found that data scientists are not created equal. That is, data professionals differ with respect to the skills they possess. For example, some professionals are proficient in statistical and mathematical skills while others are proficient in computer science skills. Still others have strong business acumen. In the current analysis, I want to determine the breadth of talent that data professionals possess to better understand the possibility of finding a single data scientist who is skilled in all areas. First, let’s review the study sample and how we measured talent.
We asked hundreds of data professionals to tell us about their skills in five areas: Business, Technology, Math & Modeling, Programming and Statistics. Each skill area included five specific skills, for a total of 25 different data skills.
For example, in the Business Skills area, data professionals were asked to rate their proficiency in such specific skills as “Business development” and “Governance & Compliance (e.g., security).” In the Technology Skills area, they were asked to rate their proficiency in such skills as “Big and Distributed Data (e.g., Hadoop, Map/Reduce, Spark)” and “Managing unstructured data (e.g., noSQL).” In the Statistics Skills area, they were asked to rate their proficiency in such skills as “Statistics and statistical modeling (e.g., general linear model, ANOVA, MANOVA, Spatio-temporal, Geographical Information System (GIS))” and “Science/Scientific Method (e.g., experimental design, research design).”
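For each of the 25 skills, respondents were asked to tell us their level of proficiency using the following scale, listed here from lowest to highest: Don’t Know, Fundamental Awareness, Novice, Intermediate, Advanced, Expert.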
This rating scale is based on a proficiency rating scale used by NIH. Each proficiency level was fully defined in the instructions given to the data professionals.
The different levels of proficiency are defined around the data professional’s ability to give help or need to receive it. In the instructions to the data professionals, the “Intermediate” level of proficiency was defined as the ability “to successfully complete tasks as requested.” We used that proficiency level (i.e., Intermediate) as the minimum acceptable level of proficiency for each data skill. The proficiency levels below the Intermediate level (i.e., Novice, Fundamental Awareness, Don’t Know) were defined by an increasing need for help on the part of the data professional. Proficiency levels above the Intermediate level (i.e., Advanced, Expert) were defined by the data professional’s increasing ability to give help or be known by others as “a person to ask.”
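The threshold rule described above can be sketched in a few lines of Python. This is only an illustration: the names `Proficiency`, `MINIMUM_ACCEPTABLE`, `meets_minimum` and `help_direction` are hypothetical, and the integer ranks are an assumption that encodes the order of the levels, not the scores used in the survey.

```python
from enum import IntEnum


class Proficiency(IntEnum):
    """Proficiency levels in ascending order.

    The integers are ordinal ranks assumed for illustration only;
    they are not the numeric scores used in the survey.
    """
    DONT_KNOW = 0
    FUNDAMENTAL_AWARENESS = 1
    NOVICE = 2
    INTERMEDIATE = 3
    ADVANCED = 4
    EXPERT = 5


# Intermediate is treated as the minimum acceptable level for every skill.
MINIMUM_ACCEPTABLE = Proficiency.INTERMEDIATE


def meets_minimum(level: Proficiency) -> bool:
    """Return True when a rating is at or above the Intermediate level."""
    return level >= MINIMUM_ACCEPTABLE


def help_direction(level: Proficiency) -> str:
    """Summarize a rating by whether the person needs or can give help."""
    if level < MINIMUM_ACCEPTABLE:
        return "needs help"
    if level > MINIMUM_ACCEPTABLE:
        return "can give help"
    return "completes tasks as requested"


if __name__ == "__main__":
    for level in Proficiency:
        print(level.name, meets_minimum(level), help_direction(level))
```

Using an ordered enumeration keeps the comparison against the Intermediate cutoff explicit and avoids implying that the gaps between levels are equal.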
We looked at the level of proficiency for the 25 different data skills across four different job roles. As seen in Figure 1, data professionals tend to be skilled in areas that are appropriate for their job role (see green-shaded areas in Figure 1). Specifically, Business Management data professionals show the most proficiency in Business Skills. Researchers, on the other hand, show the lowest level of proficiency in Business Skills and the highest in Statistics Skills.
For many of the data skills, the typical data professional does not have the minimum level of proficiency to be successful at work, no matter their role (see yellow- and red-shaded areas in Figure 1).
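A comparison like this could be reproduced along the following lines. The sketch assumes that the "typical" professional is summarized by the median rating within each job role and skill, which is a choice made here for illustration and not necessarily the statistic behind Figure 1. The function name `skills_below_minimum` is hypothetical, and the sample responses are made up purely to show the mechanics; they are not survey data.

```python
from collections import defaultdict
from statistics import median_low

# Levels in ascending order; the rank of a level is its position in this list.
LEVELS = [
    "Don't Know",
    "Fundamental Awareness",
    "Novice",
    "Intermediate",
    "Advanced",
    "Expert",
]
RANK = {name: rank for rank, name in enumerate(LEVELS)}
MINIMUM = RANK["Intermediate"]


def skills_below_minimum(responses):
    """Map each job role to the skills whose median rating is below Intermediate.

    `responses` is an iterable of (job_role, skill, level_name) tuples.
    median_low is used so the summary is always one of the actual levels.
    """
    ratings = defaultdict(list)
    for role, skill, level in responses:
        ratings[(role, skill)].append(RANK[level])

    gaps = defaultdict(list)
    for (role, skill), ranks in sorted(ratings.items()):
        if median_low(ranks) < MINIMUM:
            gaps[role].append(skill)
    return dict(gaps)


# Made-up responses for illustration only (not taken from the study).
sample = [
    ("Researcher", "Business development", "Novice"),
    ("Researcher", "Business development", "Fundamental Awareness"),
    ("Researcher", "Business development", "Intermediate"),
    ("Researcher", "Science/Scientific Method", "Advanced"),
    ("Researcher", "Science/Scientific Method", "Expert"),
    ("Researcher", "Science/Scientific Method", "Intermediate"),
]

if __name__ == "__main__":
    print(skills_below_minimum(sample))
```

With real survey responses in place of `sample`, the returned dictionary would list, for each job role, the skills in which the typical respondent falls short of the Intermediate level.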