How To Become A Machine Learning Expert In One Simple Step

This post looks at perhaps the most important, and most often overlooked, step in learning machine learning: the one that can make the biggest difference to your skill set.

The web is full of good explanations of machine learning algorithms. And every second applicant for a data science position has finished the Coursera course on machine learning. While it is important to understand the concepts behind the algorithms, one thing is even more important:

Theory will not help you choose good values for the 16 parameters a standard implementation of a random forest takes. The default values are good to get started, but which parameters should you modify depending on your data?

Choosing the right features, algorithms and parameters is an art. It's actually more like karate than like math. You won't learn it from a book. You learn it by doing, by getting your hands dirty and applying algorithms to various data sets. By lots of trial and error. By having seen hundreds of successful applications.
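To make "trial and error" concrete, here is a minimal sketch of the loop you end up running again and again: try several values of a parameter, score each on held-out data, and keep the best. It is an illustration under assumptions, not a recipe. To stay runnable with the Python standard library alone, it swaps the random forest for a toy k-nearest-neighbours classifier. The data set is synthetic, and the names `make_data`, `knn_predict`, `accuracy` and `search_k` are made up for this example.

```python
import random


def make_data(n, noise, rng):
    """Two noisy 2-D clusters labelled 0 and 1 (synthetic, for illustration only)."""
    points = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.0 if label == 0 else 2.0
        x = (rng.gauss(center, noise), rng.gauss(center, noise))
        points.append((x, label))
    return points


def knn_predict(train, x, k):
    """Majority vote among the k training points closest to x (k should be odd)."""
    nearest = sorted(
        train,
        key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2,
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def accuracy(train, valid, k):
    """Fraction of validation points classified correctly for a given k."""
    hits = sum(knn_predict(train, x, k) == y for x, y in valid)
    return hits / len(valid)


def search_k(train, valid, candidates):
    """Score every candidate k on the validation set and return the best one."""
    scores = {k: accuracy(train, valid, k) for k in candidates}
    best = max(scores, key=scores.get)
    return best, scores


rng = random.Random(0)            # fixed seed so repeated runs are comparable
train = make_data(200, 1.0, rng)  # data the model learns from
valid = make_data(100, 1.0, rng)  # held-out data used only for scoring

best_k, scores = search_k(train, valid, [1, 3, 5, 15, 45])
for k, s in scores.items():
    print(f"k={k:>2}  validation accuracy={s:.2f}")
print("best k:", best_k)
```

Change the `noise` argument or the size of the training set and the best `k` will usually move. That sensitivity to the data is exactly what theory alone does not tell you, and what you start to develop a feel for after doing this on many real data sets.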

Take the people winning Kaggle competitions, for example. One might think they are the leading researchers in the field of machine learning. In fact, most of them have simply spent a lot of time on Kaggle working with actual data. Check for yourself: Kaggle publishes profiles of top Kagglers on its blog.

To get better at applying machine learning techniques, here is the one simple step I recommend. While it is simple as a concept, it will (of course) take perseverance and many hours of work.

It is easy to get this one wrong, though. The prize money and the leaderboard could easily make you think it is about winning Kaggle competitions. The inglorious truth is that winning is a distraction. Winning a data science competition requires skills that are not relevant to a data science job. You need to build overly complex models and squeeze out the last fractions of a percentage point. Even the winning model of the Netflix Prize has not been used in practice.
