EDISON Data Science Framework provides conceptual, instructional and policy components required to establish the Data Science profession.
Abstract The effective use of Data Science technologies requires new competences and skills and demands new professions that support all stages of the research data lifecycle, from data production and input to data processing and storage, and to the publishing and dissemination of the scientific results obtained. This paper introduces the EDISON Data Science Framework (EDSF), which includes the conceptual, instructional and policy components required to establish sustainable education and training of future Data Science professionals.
Introduction Modern research requires new types of specialists who are capable of supporting all stages of the research data lifecycle, from data production and input to data processing and storage, and to the publishing and dissemination of scientific results. Together these specialists can be defined as the Data Science professions family. Future Data Scientists must possess knowledge (and obtain competences and skills) in data mining and analytics, information visualisation and communication, as well as in statistics, engineering and computer science, and must acquire experience in the specific research or industry domain of their future work and specialisation. Although the Data Scientist is a key occupation in the data-related professions family, other occupations focus on other stages of the data lifecycle and on the supporting infrastructure.
The article describes the main components of the proposed EDISON Data Science Framework (EDSF), which is used as a basis for defining the Data Science professions family. More detailed information is provided about the Data Science Competence Framework (CF-DS) and the Data Science Body of Knowledge (DS-BoK), which are essential for defining consistent and customizable Data Science curricula.
EDISON Data Science Framework Figure 1 below illustrates the main components of the EDISON Data Science Framework (EDSF), which provides a conceptual basis for the development of the Data Science profession (including references to available documents):
The proposed framework provides a basis for other components of the Data Science professional ecosystem such as
Data Science Competence Framework and Body of Knowledge The Data Science Competence Framework (CF-DS) is a cornerstone of the EDISON Data Science Framework and is used to define other components such as the Data Science Body of Knowledge (DS-BoK) and the Data Science Model Curriculum (MC-DS). The CF-DS is defined in compliance with the European e-Competence Framework (e-CF3.0) and suggests how e-CF3.0 could be extended with Data Science-related competences and skills.
Figure 2 illustrates the main CF-DS competence groups and their inter-relation:
The identified competence areas provide a better basis for defining education and training programmes for Data Science-related jobs, re-skilling and professional certification. Knowledge of scientific research methods and techniques is what distinguishes the Data Scientist profession from all previous professions.
Data Management and Research Methods (or Business Process Management) are shown as two outer circles to stress that these competences and knowledge are required of all Data Science professional profiles. It is recommended that both Data Management (or specifically Research Data Management) and Research Methods be included in all Data Science curricula.
Figure 2: Relations between identified Data Science competence groups for (a) general or research oriented and (b) business oriented professions/profiles
The CF-DS provides a basis for the definition of the Data Science Body of Knowledge (DS-BoK), that is, the knowledge professionals need in order to perform all data-related processes of their profession. The BoK typically defines the content of a curriculum and is linked to the CF-DS via learning outcomes that can be defined for specific groups of trainees.
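The link between competences, learning outcomes and body-of-knowledge content can be illustrated with a small data model. The following is only a minimal sketch under stated assumptions: the class names, the identifier scheme (such as `DM-1` or `KU-DM-01`), the knowledge unit titles and the `curriculum_content` helper are all hypothetical and are not taken from the CF-DS or DS-BoK documents. Only the competence group names Data Management and Research Methods come from the text above; the Data Analytics group is added purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Competence:
    """A single competence; codes and groups here are illustrative only."""
    code: str
    group: str


@dataclass(frozen=True)
class KnowledgeUnit:
    """A unit of body-of-knowledge content (hypothetical identifiers)."""
    code: str
    title: str


@dataclass
class LearningOutcome:
    """Ties one competence to the knowledge units that support it."""
    statement: str
    competence: Competence
    knowledge_units: List[KnowledgeUnit] = field(default_factory=list)


def curriculum_content(
    outcomes: Iterable[LearningOutcome], required_groups: Set[str]
) -> List[KnowledgeUnit]:
    """Collect, without duplicates, the knowledge units needed to cover
    every learning outcome whose competence falls in the required groups."""
    selected = {}
    for outcome in outcomes:
        if outcome.competence.group in required_groups:
            for unit in outcome.knowledge_units:
                selected.setdefault(unit.code, unit)
    return sorted(selected.values(), key=lambda unit: unit.code)


# Hypothetical example data, not taken from the EDSF documents.
KU_DM_01 = KnowledgeUnit("KU-DM-01", "Data management planning")
KU_DM_02 = KnowledgeUnit("KU-DM-02", "Data curation and preservation")
KU_RM_01 = KnowledgeUnit("KU-RM-01", "Experiment design")
KU_DA_01 = KnowledgeUnit("KU-DA-01", "Statistical modelling")

OUTCOMES = [
    LearningOutcome(
        "Develop and maintain a data management plan",
        Competence("DM-1", "Data Management"),
        [KU_DM_01, KU_DM_02],
    ),
    LearningOutcome(
        "Design a study and plan the data it will produce",
        Competence("RM-1", "Research Methods"),
        [KU_RM_01, KU_DM_01],
    ),
    LearningOutcome(
        "Fit and evaluate a statistical model",
        Competence("DA-1", "Data Analytics"),
        [KU_DA_01],
    ),
]

if __name__ == "__main__":
    core = curriculum_content(OUTCOMES, {"Data Management", "Research Methods"})
    for unit in core:
        print(unit.code, unit.title)
```

Selecting a different set of competence groups for a different group of trainees yields a different list of knowledge units, which is one way to read the statement that curricula built on the CF-DS and DS-BoK are customizable.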