Sexiest job... massive shortage... blah blah blah. Are you looking to get a real handle on the career paths available in "Data Science" and "Big Data"? Read this article for insight on where to look to sharpen the required entry-level skills.
I have recently had a lot of folks reach out, mainly on LinkedIn, looking for advice on getting started in "Data Science" and/or "Big Data." These people are generally interested in breaking into "the field" and need some direction on how to go about doing so.
A common theme in these requests, however (and I say this with the utmost respect), is a general lack of understanding of what it is they are actually asking. And that's fine; everyone needs to start somewhere, no matter what it is they are learning. Instead of answering these similar requests one by one, this post will serve to lay out some very basic concepts related to "Data Science" and/or "Big Data" career paths, and hopefully provide some advice on how to get one's feet wet in this convoluted field.
Before going any further, read the following articles. I mean it. Read. These. Articles.
The first article provides a general overview of some of the dominant concepts in data science, with the second being an update to these concepts from earlier this year. The third article provides a deeper treatment of the concepts of data science and Big Data. The fourth and final article is a quick discussion touching on some of the complexities and nuances surrounding the use of the term "data science" versus a number of other terms.
I have broken up the various professional possibilities into an easily manageable set of 5 career paths. While there may be mass outcry and widespread panic related to this particular division of roles, it really serves to categorize skills and professional responsibilities at a high level. I believe the following is quite useful for orienting newcomers to the myriad opportunities which exist in this professional realm, opportunities which are often easily conflated and confused.
Back-of-the-envelope analysis of analytics careers
Data management is essentially an IT role, akin to that of the database administrator. The data management professional is concerned with managing data and the infrastructure which supports it. Little to no data analysis takes place in such a role, and the use of languages such as Python and R is likely not necessary. SQL may be of use, as well as Hadoop-related query tools such as Hive or Pig.
Key technologies and skills to focus on:
Data engineering is the big Big Data non-analytic career path. The data infrastructure mentioned in the previous career path? Well, it needs to be designed and implemented, and the data engineer does that. If the data management professional is the car mechanic, the data engineer is the automotive engineer. But don't get it twisted; both of these roles are crucial to both the delivery and continued functioning of your car, and are of equal importance when you are driving from point A to point B.
Truth be told, the technologies and skills required for data engineering and data management are similar; however, each role uses and understands these concepts at a different level. I won't repeat the information shared in the role above (all of which is important to the data engineer), and will instead add some further reading specific to the data engineer.
I'm using business analyst in this context to refer to roles related strictly to the analysis and presentation of data. This includes reporting, dashboards, and anything referred to as "business intelligence." The role often requires interaction with (or querying of) databases, both relational and non-relational, as well as with Big Data frameworks.