Data Science Platforms: What are they? And why are they important?

As more companies recognize the need for a data science platform, more vendors are claiming they have one. Increasingly, we see companies describing their product as a “data science platform” without describing the features that make platforms so valuable. So we wanted to share our vision for the core capabilities a platform should have in order for it to be valuable to data science teams.

We see “the data science lifecycle” spanning three phases: ideation and exploration, experimentation, and operationalization. Each phase has distinct demands that motivate capabilities for a data science platform.

To some degree, all data science projects go through these phases.

We’ll discuss each phase in turn, describing the challenges involved and the capabilities a good data science platform should provide.

Quantitative research starts with exploring the data to understand what you have. This might mean plotting data in different ways, examining different features, looking at the values of different variables, etc.
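As a rough illustration of that first look at a data set, here is a minimal sketch using only the Python standard library. The records, column names, and the `summarize` helper are hypothetical and exist only for this example; real exploration would more likely use a notebook with a data frame and plotting library.

```python
import statistics

# Hypothetical toy records; in practice these would come from your own data source.
rows = [
    {"age": 34, "income": 52000, "segment": "a"},
    {"age": 45, "income": 61000, "segment": "b"},
    {"age": 29, "income": None, "segment": "a"},
    {"age": 52, "income": 83000, "segment": "c"},
]


def summarize(rows):
    """Return a per-column summary: missing count, then numeric stats or distinct values."""
    summary = {}
    for col in rows[0].keys():
        values = [r[col] for r in rows]
        present = [v for v in values if v is not None]
        info = {"missing": len(values) - len(present)}
        if present and all(isinstance(v, (int, float)) for v in present):
            # Numeric column: report its range and mean over the non-missing values.
            info.update(min=min(present), max=max(present), mean=statistics.mean(present))
        else:
            # Non-numeric column: list the distinct values that appear.
            info["distinct"] = sorted(set(present))
        summary[col] = info
    return summary


profile = summarize(rows)
for col, info in profile.items():
    print(col, info)
```

Even a summary this small surfaces the kind of thing exploration is meant to catch, such as a missing value in one column.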

Ideation and exploration can be time-consuming. The data sets can be large and unwieldy, or you may want to try new packages or tools. If you’re working on a team and have no way of seeing work others have already done, you may end up redoing it. Other people may have already developed insights, created clean data sets, or determined which features are useful and which are not.

Through the process of exploring data, researchers formulate ideas they want to test. At this point, research often shifts from ad hoc work in notebooks to more hardened batch scripts. People run an experiment, review the results, and make changes based on what they’ve learned.

This phase can be slow when experiments are computationally intensive (e.g., model training tasks). This is also where the “science” part of data science can be especially important: tracking variations in your experiments, ensuring past results are reproducible, and getting feedback through a peer review process.
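To make the tracking and reproducibility point concrete, here is a minimal sketch of an experiment log in plain Python. Everything in it is an assumption for illustration: `run_experiment` is a stand-in for a real training job, and `record_run`, the parameter names, and the log structure are hypothetical rather than any particular platform's API.

```python
import hashlib
import json
import random


def run_experiment(params, seed):
    """Stand-in for a training run: the result depends only on params and seed."""
    rng = random.Random(seed)  # local RNG, so the run does not depend on global state
    noise = rng.uniform(-0.01, 0.01)
    return round(0.8 + 0.05 * params["depth"] / 10 + noise, 6)


def record_run(log, params, seed):
    """Run an experiment and append a record keyed by a hash of its inputs."""
    payload = json.dumps({"params": params, "seed": seed}, sort_keys=True)
    run_id = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    entry = {
        "run_id": run_id,
        "params": params,
        "seed": seed,
        "metric": run_experiment(params, seed),
    }
    log.append(entry)
    return entry


log = []
first = record_run(log, {"depth": 4}, seed=42)
second = record_run(log, {"depth": 8}, seed=42)   # a variation on the first run
rerun = record_run(log, {"depth": 4}, seed=42)    # repeat of the first run

# Identical inputs give the same run id and the same metric.
print(first["run_id"] == rerun["run_id"], first["metric"] == rerun["metric"])
```

Keying each run by a hash of its parameters and seed is one simple way to tell variations apart and to check that a past result can be reproduced; a real platform would typically also capture the code version, data version, and environment.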

Data science work is only valuable insofar as it creates some impact on business outcomes. That means the work must be operationalized or productionized somehow, i.e., it must be integrated into business processes or decision-making processes.

 


