Understanding Machine Learning

What exactly is machine learning?

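The simplest definition I came across describes machine learning as a branch of AI that explores ways to get computers to improve their performance based on experience.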

Let’s break that down to set some foundations on which to build our machine learning knowledge.

Branch of AI: Artificial intelligence is the study and development of computer systems that can successfully accomplish tasks that would typically require human intelligence. Machine learning is a part of that field. It is the technology and process by which we train the computer to accomplish such a task.

Explores ways: Machine learning techniques are still emerging. Some models for training a computer are already recognized and used (as we will see below), but more are expected to be developed over time. The key idea is that different models can be used when training a computer, and different business problems require different models.

Get computers to improve their performance: For a computer to accomplish a task with AI, it needs practice and adaptation. A machine learning model needs to be trained using data and, in most cases, a little human help.

Based on experience: Providing an AI with experience is another way of saying that we provide it with data. As more data is fed into the system, the computer can respond more accurately to it and to future data that it will encounter. More accuracy in understanding the data means a better chance of successfully accomplishing the given task, or a higher degree of confidence when providing predictive insight.

Machine learning is often referred to as magical or a black box.

Let’s take a look at the training process itself to better understand how machine learning can create value with data.

Collect: Machine learning depends on data. The first step is to make sure you have the right data for the problem you are trying to solve. Consider your ability to collect it, its source, the required format, and so on.

Clean: Data can be generated by different sources, stored in different file formats, and expressed in different languages. You may need to add or remove information from your data set, as some instances might be missing information while others might contain undesired or irrelevant entries. How well the data is prepared affects its usability and the reliability of the outcome.

Split: Depending on the size of your data set, only a portion might be required. This is usually referred to as sampling. From the chosen sample, your data should be split into two groups: one to train the algorithm and the other to evaluate it.

Train: This stage essentially aims at finding the mathematical function that will accurately accomplish the chosen goal. Using the training portion of your data set, the algorithm processes the data, measures its own performance, and automatically adjusts its parameters until it can consistently produce the desired outcome with sufficient reliability. In neural networks, these adjustments are commonly computed with backpropagation.

Evaluate: Once the algorithm performs well on the training data, its performance is measured again with data that it has not yet seen. Additional adjustments are made when needed. This step lets you detect overfitting, which happens when the learning algorithm performs well, but only on your training data.

Optimize: The model is optimized for integration within its target application to ensure it is as lightweight and as fast as possible.
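To make the Clean, Split, Train, and Evaluate steps more concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the data is synthetic (points scattered around y = 3x + 2), the model is a simple straight line, and the function names (clean, split, train, mean_squared_error) are hypothetical. It uses plain gradient descent rather than a neural network with backpropagation, and it skips the Optimize step.

```python
import random

def clean(rows):
    """Clean: drop rows where the input or the expected output is missing."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def split(rows, train_fraction=0.8, seed=0):
    """Split: shuffle, then separate a training set from an evaluation set."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

def mean_squared_error(params, rows):
    """Average squared gap between predictions w*x + b and the true y."""
    w, b = params
    return sum((w * x + b - y) ** 2 for x, y in rows) / len(rows)

def train(rows, learning_rate=0.1, epochs=2000):
    """Train: fit y ~ w*x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(rows)
    for _ in range(epochs):
        # Measure the current error and nudge the parameters to reduce it.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in rows) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in rows) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Collect: synthetic data (an assumption for this sketch), y = 3x + 2 plus noise.
rng = random.Random(42)
raw = [(x, 3 * x + 2 + rng.gauss(0, 0.1))
       for x in (rng.uniform(0, 1) for _ in range(200))]
raw[5] = (None, 1.0)    # simulate a row with a missing input
raw[17] = (0.5, None)   # simulate a row with a missing output

data = clean(raw)
train_rows, eval_rows = split(data)
params = train(train_rows)

# Evaluate: compare the error on seen data with the error on unseen data.
train_mse = mean_squared_error(params, train_rows)
eval_mse = mean_squared_error(params, eval_rows)
print("parameters:", params)
print("training error:", train_mse, "evaluation error:", eval_mse)
```

If the evaluation error were much larger than the training error, that would be a sign of overfitting. With a model this simple and data this clean, the two should stay close.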

There are many different models that can be used in machine learning, but they are typically grouped into three types of learning: supervised, unsupervised, and reinforcement. Depending on the task to complete, some models are more appropriate and perform better than others.
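As a rough illustration of the difference between the first two types, the sketch below contrasts a supervised model, which learns from examples that come with labels, with an unsupervised one, which only sees the raw values and has to find the groups itself. The data, the labels, and the function names (nearest_neighbor_predict, two_means) are hypothetical, and these are just two simple techniques among many. Reinforcement learning, which learns from rewards over time, is not shown.

```python
def nearest_neighbor_predict(labeled, x):
    """Supervised: return the label of the closest labeled example."""
    closest = min(labeled, key=lambda pair: abs(pair[0] - x))
    return closest[1]

def two_means(values, iterations=10):
    """Unsupervised: group unlabeled numbers into two clusters (k-means, k=2)."""
    low, high = min(values), max(values)  # initial guesses for the two centers
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        if not near_low or not near_high:
            break  # all values fell into one group; nothing left to refine
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return low, high

# Supervised learning: each example carries the answer we want to learn.
labeled = [(1.0, "small"), (1.2, "small"), (0.8, "small"),
           (5.0, "large"), (5.3, "large"), (4.9, "large")]
print(nearest_neighbor_predict(labeled, 0.9))
print(nearest_neighbor_predict(labeled, 4.5))

# Unsupervised learning: same numbers, but without any labels.
values = [x for x, _ in labeled]
print(two_means(values))
```

The supervised model can name the group a new value belongs to because it was told the names up front. The unsupervised model only reports that there appear to be two groups and where their centers are.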

 


