Real data scientists have a rare hybrid of skill sets: Here’s what to look for

Over the past year, many employers and hiring managers have heard that big data is the “hot new thing.” But as with all “hot new things,” there is as much misinformation about data science as there is fact. Here are three misconceptions about big data and data science that I often encounter:

1. Big data is statistics and business intelligence with more data. There’s nothing new here.

This view is often held by those with limited or no software development experience, and it is plainly false. The perfect analogy is ice. Ice is just cold water, right? There’s nothing new here. However, cooling water doesn’t just change a quantitative property (temperature); it drastically changes its qualitative properties (transforming a liquid into a solid). The same can be said of more data. Big data strains and ultimately breaks the old paradigms of computation. With big data, the data cannot all fit into RAM, and the traditional BI calculations would take years to complete. Parallelization and distributed computation are the obvious answers to scaling, but they are not always easy: even a simple statistical tool like logistic regression does not easily parallelize. Distributed statistical computation is as different from traditional business analytics as ice is from water.
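To make the logistic regression point concrete, here is a minimal sketch, assuming a plain gradient-descent fit (one of several ways to train the model). The gradient is a sum over examples, so each shard of data can compute its part independently, but every step depends on the weights from the previous step, so the workers have to synchronize on each iteration. The shards, toy data, and function names are illustrative assumptions, and the "workers" here are simulated in a single process.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gradient(weights, rows):
    """Sum of per-example log-loss gradients over rows of (features, label)."""
    grad = [0.0] * len(weights)
    for features, label in rows:
        error = sigmoid(sum(w * x for w, x in zip(weights, features))) - label
        for j, x in enumerate(features):
            grad[j] += error * x
    return grad

def sharded_gradient(weights, shards):
    """Map: one partial gradient per shard. Reduce: add the partials together."""
    partials = [gradient(weights, shard) for shard in shards]  # separate workers in a real system
    return [sum(column) for column in zip(*partials)]

def fit(shards, n_features, steps=200, learning_rate=0.1):
    weights = [0.0] * n_features
    n_rows = sum(len(shard) for shard in shards)
    for _ in range(steps):
        # Each step needs the weights produced by the previous one,
        # so all shards must report back before the next step can start.
        grad = sharded_gradient(weights, shards)
        weights = [w - learning_rate * g / n_rows for w, g in zip(weights, grad)]
    return weights

# Toy data (assumed): features are [bias, x], label is 0 or 1.
data = [([1.0, -2.0], 0), ([1.0, -1.0], 0), ([1.0, 1.0], 1), ([1.0, 2.0], 1)]
shards = [data[:2], data[2:]]
weights = fit(shards, n_features=2)
print(weights)
```

A single pass over the data parallelizes cleanly here; the cost is the repeated round of communication, once per step, which is what makes the distributed version harder than simply adding machines.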

2. Data scientists are just rebranded software engineers.

Sometimes engineers with strong software development backgrounds will rebrand as data scientists for the salary premium. This can lead to subpar results. At the simplest level, debugging statistical bugs becomes much harder. Engineers are trained to spot and solve programming bugs, but without a solid background in probability and statistics, they often have a hard time solving statistical ones. Your code might be just fine, but if you didn’t reweight your training examples correctly, your predictions will be off.
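A small sketch of the reweighting problem, under assumed numbers: suppose 1% of a population is positive, and the training sample keeps every positive but only one in ten negatives. Nothing in the code below crashes or looks wrong, yet the unweighted estimate of the positive rate is far too high. The population sizes, the sampling rate, and the function name are all hypothetical.

```python
def positive_rate(labels, weights=None):
    """Weighted share of positive labels; unweighted when weights is None."""
    if weights is None:
        weights = [1.0] * len(labels)
    positive_weight = sum(w for y, w in zip(labels, weights) if y == 1)
    return positive_weight / sum(weights)

# Assumed population: 1,000 positives and 99,000 negatives (1% positive).
# The training sample keeps all positives but only 1 in 10 negatives.
keep_one_in = 10
sample = [1] * 1_000 + [0] * (99_000 // keep_one_in)

# The "statistical bug": treating the downsampled data as representative.
naive = positive_rate(sample)

# The fix: each kept negative stands in for `keep_one_in` original negatives.
weights = [1.0 if y == 1 else float(keep_one_in) for y in sample]
corrected = positive_rate(sample, weights)

print(f"naive: {naive:.4f}, corrected: {corrected:.4f}")
```

The same idea carries over to model training: examples from a downsampled class need a proportionally larger weight, or the predicted probabilities will be inflated for the class that was kept in full.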

3. Data scientists don’t need to understand the business, the data will tell you everything.

People with machine-learning backgrounds often succumb to this one, in part because machine learning is so powerful. But it is not omnipotent. Searching for all possible correlations is time-consuming, not to mention statistically problematic: test enough pairs of variables and some will appear related purely by chance. Data scientists need to be guided by business intuition to help them distinguish between spurious correlations and real ones. Lack of domain expertise can lead to ill-founded conclusions (“more police officers lead to higher crime rates”) that prompt bad policy recommendations (“cut the policing staff in high-crime neighborhoods”). Finally, business intuition is also important for convincing key stakeholders. These stakeholders might not be data scientists, but they are usually domain experts: talking about your correlations in a language they can understand is key to getting the kind of institutional buy-in that data science needs to achieve its promise.
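As a rough illustration of why a blind search over correlations is statistically problematic, the sketch below generates a target and 1,000 candidate features that are all pure random noise, then looks for the strongest correlation. The sample size, feature count, threshold, and seed are arbitrary assumptions chosen for the demonstration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)  # fixed seed so the run is repeatable
n_rows, n_features = 20, 1000

# Everything here is independent noise: no feature truly relates to the target.
target = [rng.gauss(0, 1) for _ in range(n_rows)]
features = [[rng.gauss(0, 1) for _ in range(n_rows)] for _ in range(n_features)]

correlations = [pearson(feature, target) for feature in features]
strongest = max(correlations, key=abs)
n_strong = sum(abs(r) > 0.5 for r in correlations)

print(f"strongest correlation found: {strongest:.2f}")
print(f"features with |r| > 0.5: {n_strong} of {n_features}")
```

With only 20 rows and this many candidates, some noise features will look convincingly related to the target. Domain knowledge is one way to decide which candidates are worth testing in the first place, and a multiple-comparisons correction is another guard once the search has been run.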

Big data and data science are about building the right model by combining the right engineering, statistical, and business skills. Without all three, your data scientists will not be able to achieve everything they set out to do.


