Since the dawn of big data, privacy concerns have overshadowed every advancement and every new algorithm. This is the same for machine learning, which learns from big data to essentially think for itself. This presents an entirely new threat to privacy, opening up volumes of data for analysis on a whole new scale. Many standard applications of machine learning and statistics will, by default, compromise the privacy of individuals represented in the data sets. They are also vulnerable to hackers, who would edit the training data, compromising both the data and the final goal of the algorithm.
A recent project that demonstrates how machine learning could directly be used in the invasion of privacy was carried out by researchers at Cornell Tech in New York. Ph.D. candidate, Richard McPherson, Professor Vitaly Shmatikov, and Reza Shokri applied basic machine learning algorithms - not even specifically written for the purpose - to identify people in blurred and pixelated images. In tests where humans had no chance of identifying the person (0.19%), they say the algorithm had 71% accuracy. This went up to 83% when the computer was given five opportunities.
Blurred and pixelated images have long been used to disguise people and objects. License plate numbers are routinely blurred on television, as well as the faces of underage criminals, victims of particularly horrific crimes and tragedies, and those wishing to remain anonymous when interviewed. YouTube even offers its own facial blurring tool, developed to mask protestors and prevent potential retribution.
What’s possibly the most shocking part of the research is the ease with which the researchers were able to make it work.
Chief Analytics Officer Europe
15% off with code 7WDCAO17
Chief Analytics Officer Spring 2017
15% off with code MP15
Big Data and Analytics for Healthcare Philadelphia
$200 off with code DATA200
10% off with code 7WDATASMX
Data Science Congress 2017
20% off with code 7wdata_DSC2017