Cloud file syncing and sharing service Dropbox today disclosed more of the technical details underpinning the optical character recognition (OCR) feature in the company’s flagship Android and iOS apps. The feature is now available to people who work in organizations that pay for the Dropbox Business service tier.
The feature kicks into action after you scan a document using the camera on your mobile device. The app then lets you crop or rotate the document as necessary before saving it as a PDF in Dropbox. In August the company said it was using computer vision to detect the edges of documents that the app scans, but all this time the system for doing OCR has been a mystery. Today that changed.
OCR, of course, has been around for a while. And the idea of doing OCR with the help of deep learning — a type of artificial intelligence (AI) that entails training artificial neural networks on data and then getting the neural networks to make inferences about new data — is not in itself new. Open-source software to do it is available for the taking on GitHub. Google has looked to deep learning to more effectively do OCR on numbers in Google Street View imagery, for one thing.
Dropbox hasn’t talked much about AI over the years, even though it has lots of users and lots of data. So today’s lengthy blog post from senior software engineer Brad Neuberg is noteworthy.
The initial version of the OCR system drew on a commercially available software development kit (SDK). Dropbox opted to roll its own in order to save money and also increase accuracy, because the commercially available system was primarily built for actual hardware scanners, not scanners that use cameras on mobile devices.
To train its system, Dropbox looked to its user base.
“We began by collecting a representative set of donated document images that match what users might upload, such as receipts, invoices, letters, etc.,” Neuberg wrote. “To gather this set, we asked a small percentage of users whether they would donate some of their image files for us to improve our algorithms. At Dropbox, we take user privacy very seriously and thus made it clear that this was completely optional, and if donated, the files would be kept private and secure. We use a wide variety of safety precautions with such user-donated data, including never keeping donated data on local machines in permanent storage, maintaining extensive auditing, requiring strong authentication to access any of it, and more.”