Data science in the cloud

A year ago, I worked with large amounts of data stored on a server and on my laptop. Most of the data I used was in the form of text or binary files that were linked together with lookup tables. This method worked for me because I was the only one using the data, but it also meant I had to be very careful not to make any mistakes in the data analyses.

Since then, a lot has changed. First of all, I changed jobs, from research fellow in academia to developer advocate for IBM Cloud Data Services. I made this change because I was eager to work with and learn more about data science in the cloud. As a developer advocate, I now get to play with all the new tools that IBM offers and write and talk about them.

The basics of data science have not changed, but the tools I use have. I am still using Python most of the time, but storing my data has become much simpler. For me, a typical workflow goes through the following steps:

The first step, defining the question, is for me to come up with examples that show how to use new techniques and tools or how to work with interesting data sets. This step looks different when you work for a company that needs to solve problems or gain insights from its data, for instance to increase sales or understand customer behavior. But the workflow is pretty much the same.

When you know which question you want to answer, it is time to look for the right data, which can come in any format and size and from many different sources. After a first, quick exploration of the data, you have to decide how to use it and, if needed, where to store it. Often I use an application programming interface (API) to collect data and then, if the data is structured, store it in a data warehouse such as dashDB.
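
As a rough illustration of the collect-and-store step, here is a minimal Python sketch. It is not the dashDB workflow itself: the standard-library `sqlite3` module stands in for the data warehouse, and the function names, table name, and sample records are hypothetical. It assumes the API returns a JSON array of flat objects that all share the same keys.

```python
import json
import sqlite3
from urllib.request import urlopen


def fetch_records(url):
    """Download a JSON array of flat objects from an API endpoint (hypothetical URL)."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def store_records(conn, table, records):
    """Create a table from the keys of the first record and insert every record as a row."""
    columns = list(records[0].keys())
    col_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({col_sql})')
    conn.executemany(
        f'INSERT INTO "{table}" ({col_sql}) VALUES ({placeholders})',
        [tuple(r.get(c) for c in columns) for r in records],
    )
    conn.commit()


# Made-up sample standing in for an API response; with a real endpoint,
# fetch_records(url) would supply the records instead.
sample = [
    {"city": "London", "temp_c": 11.5},
    {"city": "Amsterdam", "temp_c": 9.0},
]

# SQLite in memory is an assumption used as a stand-in for a warehouse.
conn = sqlite3.connect(":memory:")
store_records(conn, "weather", sample)
rows = conn.execute("SELECT city, temp_c FROM weather ORDER BY city").fetchall()
print(rows)
```

With a real warehouse, the connection object and the SQL dialect would differ, but the shape of the step stays the same: fetch structured records, create a table that matches them, and insert the rows.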
