What is the profession of data science really about? How does one best become a data scientist or grow a career as one? What does the Data Science Central community think about these questions? (Please chime in!)
We’ve all read about the shortage of data scientists from McKinsey, heard about the salaries, and know about the volume of recruiter emails. As a practicing data scientist at Pivotal (a leading vendor of open source big data platforms used specifically for data science), I was recently interviewed on careers in data science. Because it has been a popular topic on Data Science Central, I wanted to share some of this perspective and see what other practitioners think.
What Is Data Science About? What Is The Heritage And Current Practice?
From my own historical perspective, looking back to high school algebra, no one told us math could predict that someone would end up in the ER or even the ICU, and help prevent it. Depending on when you graduated college, you may have heard about algorithmic trading or analyzing the human genome, but, until very recently, we certainly didn’t hear about sentiment analysis on social media or machine learning on sensor data. Now, anyone with internet access is exposed to apps driven by data science on YouTube, Facebook, Twitter, and their phone. Analytics are embedded in every part of our lives, personal and business.
Today, most of the people I know in data science enjoy blazing new trails with the latest technology and lots of data, solving problems that the world could never address before. Our data science practice at Pivotal isn’t about reports, basic analytics, or business intelligence on data sitting inside traditional enterprise apps like CRM, ERP, SCM, or anything else that took a paper-based workflow and stuck it in a database. While the heritage of data science is most closely related to statistics, data science is more exploratory: our team is in search of new discoveries and “eureka” moments.
We don’t know what is possible when we start work: we only have a compass, not a map. We start by looking at ten terabytes of data from 20 different systems that no one has ever looked at holistically before. We let the data take us places and envision what is possible with it, challenging everything we find along the way. We only know the data can be used to uncover, interpret, and optimize things in new ways that create value. Then, we use math to create a new method to improve something. Outside of our tribe, people often need an example to really grasp the fact that we go beyond pie chart creation and forecast roll-ups. We generate those stories for them and, importantly, drawing on a lot more information, we tell them how accurate the results probably are and what they are likely to be in the future.