Real data scientists have a rare hybrid of skill sets: Here’s what to look for

Over the past year, many employers and hiring managers have heard that big data is the “hot new thing.” But as with all “hot new things,” there is as much misinformation about data science as there is fact. Here are three misconceptions about big data and data science that I often encounter:

1. Big data is statistics and business intelligence with more data. There’s nothing new here.

This is a view often held by those with limited or no software development experience, and it is plainly false. The perfect analogy for this is ice. Ice is just cold water, right? There’s nothing new here. However, cooling down water doesn’t just change a quantitative property (temperature); it drastically changes its qualitative properties (transforming a liquid into a solid). The same can be said of more data. Big data strains and ultimately breaks the old paradigms of computation. With big data, the data cannot all fit into RAM, and the traditional BI calculations would take years to complete. Parallelization and distributed computation are obvious answers to scaling, but this is not always easy: even a simple statistical tool like logistic regression does not easily parallelize. Distributed statistical computation is as different from traditional business analytics as ice is from water.
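To make the logistic regression point concrete, here is a minimal sketch in plain Python. It assumes a one-feature model fit by batch gradient descent, and the toy data, the `fit` and `partial_gradient` helpers, and the two-way split into shards are all made up for illustration. The gradient is a sum over examples, so each shard's contribution can be computed on a separate worker. The catch is that fitting is iterative: every step needs the partial results from all workers before the next step can start, which is one reason this is harder to distribute than a single-pass count or average.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def partial_gradient(shard, w, b):
    """Sum of log-loss gradients over one shard of (x, y) pairs."""
    gw, gb = 0.0, 0.0
    for x, y in shard:
        err = sigmoid(w * x + b) - y
        gw += err * x
        gb += err
    return gw, gb

def fit(shards, steps=200, lr=0.5):
    """Batch gradient descent where each step is a map over shards plus a reduce."""
    n = sum(len(s) for s in shards)
    w, b = 0.0, 0.0
    for _ in range(steps):
        # "Map": each shard's partial gradient could run on its own worker.
        parts = [partial_gradient(s, w, b) for s in shards]
        # "Reduce": every single step must wait for all workers to report back.
        gw = sum(p[0] for p in parts) / n
        gb = sum(p[1] for p in parts) / n
        w -= lr * gw
        b -= lr * gb
    return w, b

# Hypothetical toy data: (feature, label) pairs, slightly noisy.
data = [(-2.0, 0), (-1.0, 0), (-0.5, 0), (0.5, 1),
        (1.0, 1), (2.0, 1), (-0.2, 1), (0.3, 0)]
shards = [data[:4], data[4:]]

w_sharded, b_sharded = fit(shards)
w_single, b_single = fit([data])
print(w_sharded, w_single)
```

The sharded and single-machine fits agree up to floating-point rounding, so the arithmetic itself splits cleanly. The cost in a real cluster is the 200 rounds of coordination, not the math.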

2. Data scientists are just rebranded software engineers.

Sometimes engineers with strong software development backgrounds will rebrand as data scientists for the salary premium. This can lead to subpar results. At the simplest level, debugging statistical bugs becomes much harder. Engineers are trained to spot and solve programming bugs. But without a solid background in probability and statistics, they often have a hard time solving statistical bugs. Your code might be just fine, but if you didn’t reweight your training examples correctly, your predictions will be off.
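Here is a small illustration of that kind of statistical bug, as a sketch under assumed numbers: a hypothetical population with about 5% positive labels, and a training sample built by keeping every positive but only 10% of the negatives. Nothing in the code crashes or looks wrong, yet the unweighted estimate of the positive rate is far off until each kept negative is weighted by the inverse of its sampling probability.

```python
import random

random.seed(0)

# Hypothetical population: roughly 5% of labels are positive.
population = [1 if random.random() < 0.05 else 0 for _ in range(100_000)]

# Training sample: keep all positives, but only 10% of negatives.
keep_negative = 0.1
sample = [y for y in population if y == 1 or random.random() < keep_negative]

# The "bug": the code runs fine, but it ignores how the sample was drawn.
naive_rate = sum(sample) / len(sample)

# The fix: weight each example by 1 / (probability it was kept).
weights = [1.0 if y == 1 else 1.0 / keep_negative for y in sample]
weighted_rate = sum(w * y for w, y in zip(weights, sample)) / sum(weights)

true_rate = sum(population) / len(population)
print(f"true={true_rate:.3f} naive={naive_rate:.3f} weighted={weighted_rate:.3f}")
```

A model trained on the unweighted sample would inherit the same inflated base rate, and no unit test of the code path would catch it.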

3. Data scientists don’t need to understand the business; the data will tell you everything.

People with machine-learning backgrounds often succumb to this one, in part because machine learning is so powerful. But it is not omnipotent. Searching for all possible correlations is time-consuming, not to mention statistically problematic: test enough pairs of variables and some will look related purely by chance. Data scientists need to be guided by business intuition to help them distinguish between spurious correlations and real ones. Lack of domain expertise can lead to ill-founded conclusions (“more police officers lead to higher crime rates”) that prompt bad policy recommendations (“cut the policing staff in high-crime neighborhoods”). Finally, having business intuition is also important for convincing key stakeholders. These stakeholders might not be data scientists, but they are usually domain experts: talking about your correlations in a language they can understand is key to getting the kind of institutional buy-in that is necessary for data science to achieve its promise.
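A quick simulation shows why a blind search for correlations is statistically problematic. This is a sketch with made-up sizes: 1,000 features and a target that are all independent random noise, observed over 30 rows. The cutoff of 0.361 is the standard two-sided critical value of the Pearson correlation at the 5% level for 30 observations. Even though no feature has any real relationship to the target, roughly 5% of them should clear the bar by chance.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

random.seed(1)
n_rows, n_features = 30, 1000  # illustrative sizes, not from any real data set

# Everything here is independent Gaussian noise: no real relationships exist.
target = [random.gauss(0, 1) for _ in range(n_rows)]
features = [[random.gauss(0, 1) for _ in range(n_rows)]
            for _ in range(n_features)]

R_CRITICAL = 0.361  # two-sided 5% critical value of r for n = 30
hits = sum(1 for f in features if abs(pearson(f, target)) > R_CRITICAL)
print(f"{hits} of {n_features} pure-noise features look 'significant'")
```

Domain knowledge is what lets you decide in advance which handful of relationships are worth testing, instead of sifting through dozens of false alarms afterwards.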

Big data and data science are about building the right model that combines the right engineering, statistical, and business skills. Without all three, your data scientists will not be able to achieve everything they set out to do.