Microsoft and Google Want to Let Artificial Intelligence Loose on Our Most Private Data
New ways to use machine learning without risking sensitive data could unlock new ideas in industries like health care and finance.
April 19, 2016
The recent emergence of a powerful machine-learning technique known as deep learning, which lets software learn to do things like recognize images or understand language, has made computing giants such as Google, Facebook, and Microsoft even hungrier for data.
Yet many problems where deep learning could be most valuable involve data that is hard to come by or is held by organizations that are unwilling to share it. And as Apple CEO Tim Cook puts it, some consumers are already concerned about companies “gobbling up” their personal information.
“A lot of people who hold sensitive data sets like medical images are just not going to share them for legal and regulatory concerns,” says Vitaly Shmatikov, a professor at Cornell Tech who studies privacy. “In some sense we’re depriving these people from the benefits of deep learning.”
Shmatikov and researchers at Microsoft and Google are all working on ways to get around that privacy problem. By providing ways to use and train the artificial neural networks used in deep learning without needing to gobble up everything, they hope to be able to train smarter software, and convince the guardians of sensitive data to make use of such systems.
Shmatikov and colleague Reza Shokri are testing what they call “privacy-preserving deep learning.” It provides a way to get the benefit of multiple organizations (say, different hospitals) combining their data to train deep-learning software without having to take the risk of actually sharing it.
Each organization trains deep-learning algorithms on its own data, and then shares only key parameters from the trained software.
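To make the idea concrete, here is a minimal Python sketch of that workflow. It is an illustration under stated assumptions, not the researchers' actual system: a two-parameter linear model stands in for a deep neural network, plain averaging of every parameter stands in for sharing only selected "key" parameters, and the names `train_locally`, `average_parameters`, and `make_private_dataset` are hypothetical.

```python
import random

def train_locally(data, weights, lr=0.1, epochs=200):
    """Fit y ~ w*x + b by gradient descent on one organization's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        # Gradients of mean squared error, computed only from local records.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def average_parameters(param_list):
    """Combine the shared parameters; the raw data is never seen here."""
    n = len(param_list)
    size = len(param_list[0])
    return tuple(sum(p[i] for p in param_list) / n for i in range(size))

def make_private_dataset(seed, n=50):
    """Synthetic stand-in for one hospital's records (assumed: y = 3x + 1 plus noise)."""
    rng = random.Random(seed)
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3.0 * x + 1.0 + rng.gauss(0, 0.05)) for x in xs]

# Three organizations, each holding data that never leaves its own list.
hospitals = [make_private_dataset(seed) for seed in (1, 2, 3)]

shared = (0.0, 0.0)  # starting parameters that everyone agrees on
for _ in range(5):
    # Each organization trains on its own data, starting from the shared model...
    local_params = [train_locally(data, shared) for data in hospitals]
    # ...and only the resulting parameters are exchanged and combined.
    shared = average_parameters(local_params)

print("shared model parameters:", shared)
```

In this sketch the only values that cross organizational boundaries are the two numbers in each parameter tuple. Whether shared parameters can still leak details of the underlying records is a separate question that the sketch does not address.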