Dropbox uses AI to recognize words in documents scanned in its mobile apps

Cloud file syncing and sharing service Dropbox today disclosed more of the technical details underpinning the optical character recognition (OCR) feature that’s now available in the company’s flagship Android and iOS apps for people who work in organizations that pay for the Dropbox Business service tier.

The feature kicks into action after you scan a document using the camera on your mobile device. The app then lets you crop or rotate the document as necessary before saving it as a PDF in Dropbox. In August the company said it was using computer vision to detect the edges of documents that the app scans, but all this time the system for doing OCR has been a mystery. Today that changed.

OCR, of course, has been around for a while. And the idea of doing OCR with the help of deep learning, a type of artificial intelligence (AI) that entails training artificial neural networks on data and then getting them to make inferences about new data, is not in itself new. Open-source software to do it is freely available on GitHub. Google, for one, has looked to deep learning to do OCR more effectively on numbers in Google Street View imagery.

Dropbox hasn’t talked much about AI over the years, even though it has lots of users and lots of data. So today’s lengthy blog post from senior software engineer Brad Neuberg is noteworthy.

The initial version of the OCR system drew on a commercially available software development kit (SDK). Dropbox then opted to build its own system, both to save money and to increase accuracy, because the commercial system was built primarily for dedicated hardware scanners, not for documents captured with cameras on mobile devices.

To train its system, Dropbox looked to its user base.

“We began by collecting a representative set of donated document images that match what users might upload, such as receipts, invoices, letters, etc.,” Neuberg wrote. “To gather this set, we asked a small percentage of users whether they would donate some of their image files for us to improve our algorithms. At Dropbox, we take user privacy very seriously and thus made it clear that this was completely optional, and if donated, the files would be kept private and secure. We use a wide variety of safety precautions with such user-donated data, including never keeping donated data on local machines in permanent storage, maintaining extensive auditing, requiring strong authentication to access any of it, and more.”
