Low-Budget Natural Language Processing

If you’ve ever talked to Siri or performed a Google Search using colloquial language and gotten the right answer, you probably had that magical feeling of being understood by a machine. The discipline that studies the interactions between human languages and computers is called natural language processing (NLP), and it’s a very active field. Companies and computer scientists are developing amazing techniques for improving performance on this task, but adding these features to our sites and apps can be very complex. Even great, free resources aren’t useful if you don’t have the time or skills to use them.

The good news is that we can take advantage of our human ability to analyze natural language and use really simple techniques to assist and amaze our users. I’ll explore a couple of ways to use these techniques in your own projects. These examples use web technologies but can be translated to other platforms and systems easily.

One of the goals of the Coral Project team while building Ask, a web product that enables news organizations to ask questions of their readers, was to build the form generation side of the project as an API.

One of the benefits of an API is that it allows developers to create their own integrations and user interfaces for creating and editing the forms. To showcase some possibilities, I built an alternative form creator targeting journalists and news devs who were setting up Ask for the first time.

When creating a form, it’s important to select the most appropriate UI input for each question. This helps the user understand how to answer, and it helps us understand the data. Since every question in the questionnaire needs a title, I thought it was the perfect scenario for applying a silly but effective NLP technique. The idea is simple: read the title as the user types it, guess which question type fits best, and preselect that type.

I used Preact for writing this website (source code), just because I like over-engineering my experiments. But we can implement this easily with jQuery:
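
A minimal sketch of that wiring, assuming jQuery is loaded on the page. The element IDs `#question-title` and `#question-type` and the helper names are hypothetical, chosen for illustration, and the single placeholder rule only stands in for a real algorithm.

```javascript
// Placeholder for the real algorithm: returns a type name or null.
// (Hypothetical single rule, for illustration only.)
function placeholderGuess(title) {
  return /^how many\b/i.test(title) ? 'number' : null;
}

// Wire the title input to the type selector.
// `$` is passed in so the function uses whatever jQuery instance the page has.
function suggestTypeWhileTyping($, guess) {
  $('#question-title').on('input', function (event) {
    const suggestion = guess($(event.target).val());
    // No match: leave the current selection alone.
    if (suggestion !== null) {
      $('#question-type').val(suggestion);
    }
  });
}

// In the browser, run the wiring once the DOM is ready.
if (typeof window !== 'undefined' && typeof window.jQuery === 'function') {
  window.jQuery(function ($) {
    suggestTypeWhileTyping($, placeholderGuess);
  });
}
```

Listening to `input` rather than `change` means the suggestion updates on every keystroke instead of only when the field loses focus.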

Of course, this is easy to implement only because we haven’t shown the hardest part yet: the algorithm.

If you want to get really advanced, try this: Before taking a look at the finished algorithm below, start creating a form yourself. Go to the first question of the form you’re working on, and see if you can figure out what the algorithm might look like. Once you’ve given that a try, check out how I implemented it:
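
A rule-based model of this kind can be sketched as an ordered list of regular expressions, each paired with a question type. The patterns and type names below (`email`, `date`, `number`, `yesno`) are illustrative assumptions, not the rules used in the form creator described here.

```javascript
// Ordered rules: the first pattern that matches the title wins.
// Patterns and type names are illustrative assumptions.
const RULES = [
  { type: 'email',  pattern: /\be-?mail\b/i },
  { type: 'date',   pattern: /^when\b|\b(date|birthday)\b/i },
  { type: 'number', pattern: /^how (many|much|old)\b/i },
  { type: 'yesno',  pattern: /^(do|does|did|is|are|was|were|have|has|can|would|will)\b/i },
];

// Returns the suggested question type, or null to keep the default.
function guessQuestionType(title) {
  const text = title.trim();
  const rule = RULES.find(({ pattern }) => pattern.test(text));
  return rule ? rule.type : null;
}
```

With these rules, `guessQuestionType('How many people live in your home?')` returns `'number'`, while a title that matches no rule returns `null` and the form keeps its default type. Because the first match wins, rule order matters: a title like “Is your email public?” is classified as `email` rather than `yesno`, which is exactly the kind of false positive worth watching for.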

That’s it. That’s the way I modeled the English language for my use case. Even if you don’t know what a regular expression is, you can get an idea of how to implement your own model. In case you didn’t try it for yourself or my algorithm didn’t work for you, here is an animated GIF of what you should see:

Does this algorithm cover every possibility? No. Will it work in every case? No. But the function runs in microseconds in the user’s browser (I actually measured it), it’s really simple to implement, and it helps most users choose the right question type, saving time on form creation.

Once you have your script working, you may want to know its “success rate,” which in this case can be something like: “What’s the percentage of cases where the model chose a different question type than the default and the user didn’t change it?”
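
That definition translates directly into a small calculation. The sketch below assumes each question is logged as a record with two hypothetical fields: `suggested` (the type the model picked, or `null` if it kept the default) and `final` (the type the user ended up with).

```javascript
// Share of non-default suggestions that the user kept, as a number from 0 to 1.
// Returns null when the model never made a suggestion.
function successRate(records) {
  const suggestions = records.filter((r) => r.suggested !== null);
  if (suggestions.length === 0) return null;
  const kept = suggestions.filter((r) => r.final === r.suggested);
  return kept.length / suggestions.length;
}

const sample = [
  { suggested: 'number', final: 'number' }, // kept
  { suggested: 'date',   final: 'text' },   // changed by the user
  { suggested: null,     final: 'text' },   // no suggestion, not counted
  { suggested: 'yesno',  final: 'yesno' },  // kept
];
console.log(successRate(sample)); // 2 of the 3 suggestions were kept
```

Records without a suggestion are left out of the denominator, so the rate measures only the cases where the model actually intervened.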

How can you store all of these events? An easy way: if you are already using an analytics solution, you probably get event tracking for free. I usually send this event to Google Analytics, where it’s easy to add up the results and get the success rate. After all, this success rate is a measure of how your users behave on the site.
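
One way to record the outcome, sketched under the assumption that the page loads Google Analytics through gtag.js, which exposes a global `gtag('event', name, params)` function. The event name `question_type_suggestion` and its parameters are made up for illustration; the sender is passed in so the function can be exercised without a browser.

```javascript
// Report whether the user kept the suggested type once the question is saved.
// `send` is expected to behave like gtag: send('event', name, params).
function reportSuggestionOutcome(send, suggested, finalType) {
  if (suggested === null) return; // no suggestion was made, nothing to measure
  send('event', 'question_type_suggestion', {
    suggested_type: suggested,
    outcome: suggested === finalType ? 'kept' : 'changed',
  });
}

// In the browser (assuming gtag.js is loaded):
// reportSuggestionOutcome(window.gtag, 'number', 'number');
```

Counting the `kept` events against all events of this name then gives the success rate.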

You can always improve your model by adding, modifying, or removing rules. The good thing is that if the rules don’t recognize the user’s input, you simply haven’t helped your user: they get a normal form, and the app still works as intended. The only thing that can really hurt is a false positive, where a rule matches and preselects the wrong type.
