The title “data scientist” was coined in 2008, and there are already thousands of professionals working in the field. Still, many organizations, and even data scientists themselves, struggle to define what a data scientist is and what he or she does.
Part of the problem is that, even in the short time data scientists have been around, the definition has changed, and it continues to evolve in our big data world. In the past, being a data scientist required only math and statistics capabilities. Today, the role encompasses a unique combination of skills ranging from data engineering to statistics to business analysis. In other words, the job of data scientist has become several jobs rolled into one.
Complicating matters, we are about to enter a new era of the Information Age, in which data sets will grow at an exponential rate due to new tracking mechanisms applied to everything from smartphones and televisions to online shopping and social media. In the coming years, big data will become bigger, faster and more complex.
Data scientists will be challenged to convert this increasing volume, velocity and variety of data into meaningful insights on a massive scale and in real time. This will involve more intricate predictions and computations, which in turn will spark the need for next-generation data scientists.
What will future data scientists look like? These men and women will be well-rounded professionals with both technical proficiency and business acumen, along with a mastery of statistics and dashes of programming, engineering and social sciences. They will be capable of tackling all aspects of big data problems, from data collection to analysis, interpretation and decision making.
To be successful, data scientists will need to learn a host of new and different skills. Becoming familiar with tools such as Python and Hadoop will be a priority. Techniques such as machine learning and data mining will be essential as well.
Data scientists will not only have to manage and analyze data; they will also have to understand the business implications, communicate results and know how data insights can be applied effectively to drive decisions.