Open Sourcing SparkADMM: a Massively-parallel Framework for Solving Big Data Problems


Training machine learning models over massive amounts of data is a cornerstone of many data analytics tasks. Usually this means solving large optimization problems with millions of variables and constraints. Running these computations on a parallel platform, such as Spark or Hadoop, is crucial to making them scalable.

It is not always obvious how to solve large optimization problems in parallel. ADMM, which stands for the Alternating Direction Method of Multipliers, is a popular optimization technique that provides a methodology for doing so. It permits the massive parallelization of a broad array of important machine learning tasks, such as regression and classification. For example, to train a classifier using ADMM over a very large dataset, a developer first partitions the dataset across multiple machines. A classifier is trained on each machine, based on the locally stored portion of the dataset. A global classifier, learned from the entire dataset, is then extracted through consensus: ADMM averages the local classifiers and repeats the process, forcing the local computations closer to the consensus value each time. After several iterations, ADMM arrives at a “consensus” classifier, which provably fits the entire dataset.
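The loop described above can be sketched in a few lines of plain Python. This is an illustrative toy, not SparkADMM's API: the function names (`local_update`, `consensus_admm`, `direct_fit`), the penalty parameter `rho`, and the sample data are all assumptions made for the example. To keep it short, it fits a one-parameter least-squares model (y = w * x) rather than a classifier, and it simulates the "machines" as a list of partitions in a single process. The three steps follow the standard scaled-form consensus ADMM updates.

```python
"""Toy consensus ADMM: fit y = w * x when the data is split across "machines"."""


def local_update(partition, z, u, rho):
    """Solve one machine's subproblem in closed form.

    Minimizes 0.5 * sum((w * x - y)^2) + (rho / 2) * (w - z + u)^2 over w,
    using only the (x, y) pairs stored in this partition.
    """
    sum_xx = sum(x * x for x, _ in partition)
    sum_xy = sum(x * y for x, y in partition)
    return (sum_xy + rho * (z - u)) / (sum_xx + rho)


def consensus_admm(partitions, rho=10.0, iterations=100):
    """Return the consensus slope after a fixed number of ADMM rounds."""
    n = len(partitions)
    z = 0.0        # consensus (global) model
    u = [0.0] * n  # one scaled dual variable per machine
    for _ in range(iterations):
        # 1. Each machine fits its local data, pulled toward the consensus.
        w = [local_update(p, z, u[i], rho) for i, p in enumerate(partitions)]
        # 2. Consensus step: average the local models (plus their duals).
        z = sum(w[i] + u[i] for i in range(n)) / n
        # 3. Dual step: accumulate each machine's disagreement with z.
        u = [u[i] + w[i] - z for i in range(n)]
    return z


def direct_fit(partitions):
    """Least-squares slope computed on the pooled data, for comparison."""
    pairs = [pair for p in partitions for pair in p]
    return sum(x * y for x, y in pairs) / sum(x * x for x, _ in pairs)


if __name__ == "__main__":
    # Hypothetical data, split over three simulated machines.
    data = [
        [(1.0, 2.1), (2.0, 3.9)],
        [(3.0, 6.2), (4.0, 7.8)],
        [(5.0, 10.1)],
    ]
    print("ADMM consensus:", consensus_admm(data))
    print("Pooled fit:    ", direct_fit(data))
```

On a real cluster, step 1 would run in parallel on each machine, and only the scalar (or vector) models would travel over the network for the averaging in step 2; the raw data never leaves its partition.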


ADMM’s strength lies in its generality: it gives a template for taking any serial machine learning algorithm designed to operate locally on a single dataset and parallelizing its execution over thousands of machines.

 


