Building a Real-Time Recommendation Engine With Data Science

Today we’re going to talk about data science and graph recommendations.

I’ve been with Neo4j for two years now, but have been working with Neo4j and Cypher for three. I discovered this particular graph database when I was a grad student at the University of Texas at Austin, studying for a master’s in statistics with a focus on social networks.

Real-time recommendation engines are one of the most common use cases for Neo4j, and a good showcase of how powerful and easy to use it is. To explore this, I’ll use example datasets to show how to incorporate statistical methods into these recommendations.

The first example will be simple: entirely in Cypher, with a focus on social recommendations. Next we’ll look at a similarity recommendation, which relies on calculated similarity metrics, and finally a clustering recommendation.

The first dataset covers food and drink places in Dallas/Fort Worth International Airport, one of the major airport hubs in the United States.

Place nodes are shown in yellow, and we model each place’s location in terms of gate and terminal. We also categorize each place by major food and drink category, such as Mexican food, sandwiches, bars and barbecue.

Let’s do a simple recommendation. We want to find a specific type of food in a certain location in the airport. The curly braces represent user inputs entered into our hypothetical app:

This English sentence maps really well to a Cypher query:

This pulls all the places in the category, terminal, and gate the user has requested. Then we compute the absolute distance from each place to the gate where the user is, and return the results in ascending order of distance. Again, this is a very simple Cypher recommendation based only on the user’s location in the airport.
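To make the logic of this first recommendation concrete, here is a minimal in-memory sketch in Python rather than Cypher. It is not the query from the talk: the `Place` class, the `recommend_nearby` function, and the sample places are all hypothetical, and gates are assumed to be plain integers so that a distance can be computed by subtraction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    """A food or drink place, located by terminal and gate number."""
    name: str
    category: str
    terminal: str
    gate: int


# Hypothetical sample data; these names are illustrative, not from the dataset.
PLACES = [
    Place("Taco Stand", "Mexican", "A", 12),
    Place("Burrito Bar", "Mexican", "A", 30),
    Place("Smokehouse", "Barbecue", "A", 14),
    Place("Cantina", "Mexican", "B", 13),
]


def recommend_nearby(places, categories, terminal, gate):
    """Return (place, distance) pairs for the requested categories and
    terminal, ordered by absolute gate distance, nearest first."""
    matches = [
        (place, abs(place.gate - gate))
        for place in places
        if place.category in categories and place.terminal == terminal
    ]
    return sorted(matches, key=lambda pair: pair[1])


# A user at gate 15 in terminal A asks for Mexican food.
results = recommend_nearby(PLACES, {"Mexican"}, "A", 15)
for place, distance in results:
    print(place.name, distance)
```

The three arguments after `places` play the role of the user inputs in curly braces: a set of categories, a terminal, and a gate.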

Let’s look at a social recommendation. In our hypothetical app, we have users who can log in and “like” places in a way similar to Facebook, and can also check into places:

In this app, users have “likes” relationships to place nodes and are also friends with other users. Consider this data model layered on top of the first model we explored. Now let’s find food and drink places in the requested categories, closest to the user’s gate in the user’s terminal, that the user’s friends like:

The MATCH clause is very similar to the MATCH clause of our first Cypher query, except now we are matching on places:

The first three lines are the same, but for the user in question – the user who is “logged in” – we also want to find their friends through the friend relationship, along with the places those friends liked. With just a few added lines of Cypher, we now take a social aspect into account in our recommendation engine.

Again, we only show places in the categories the user explicitly asked for and in the same terminal the user is in. We also filter by the user who is logged in and making the request. The query returns the name of each place along with its location and category, plus how many friends have liked that place and the absolute distance of the place from the user’s gate, all in the RETURN clause.
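The social variant can be sketched the same way. This is an illustrative Python stand-in for the logic described, not the Cypher from the talk: the `FRIENDS`, `LIKES`, and `PLACE_INFO` dictionaries, the user names, and the function name are hypothetical. The ordering (most friend likes first, then nearest gate) is also an assumption, since the text only says that both values are returned.

```python
from collections import Counter

# Hypothetical data: place name -> (category, terminal, gate number).
PLACE_INFO = {
    "Taco Stand": ("Mexican", "A", 12),
    "Burrito Bar": ("Mexican", "A", 30),
    "Smokehouse": ("Barbecue", "A", 14),
    "Cantina": ("Mexican", "B", 13),
}
# Hypothetical friendships and likes.
FRIENDS = {"dana": {"eli", "fay"}, "eli": {"dana"}, "fay": {"dana"}}
LIKES = {
    "eli": {"Taco Stand", "Burrito Bar"},
    "fay": {"Burrito Bar", "Cantina"},
}


def recommend_from_friends(user, categories, terminal, gate):
    """Return (place, friend_like_count, gate_distance) rows for places
    liked by the user's friends in the requested categories and terminal."""
    friend_likes = Counter()
    for friend in FRIENDS.get(user, ()):
        for place in LIKES.get(friend, ()):
            category, place_terminal, _ = PLACE_INFO[place]
            if category in categories and place_terminal == terminal:
                friend_likes[place] += 1
    rows = [
        (place, count, abs(PLACE_INFO[place][2] - gate))
        for place, count in friend_likes.items()
    ]
    # Assumed ordering: most friend likes first, then nearest, then by name.
    return sorted(rows, key=lambda row: (-row[1], row[2], row[0]))


rows = recommend_from_friends("dana", {"Mexican"}, "A", 15)
for row in rows:
    print(row)
```

In the graph version, the two nested loops correspond to traversing from the logged-in user to their friends and then to the places those friends liked.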

Now let’s take a look at a similarity recommendation engine:

As in our earlier data model, users can like places, but this time they can also rate places with an integer between 1 and 10. This is easily modeled in Neo4j by adding a property to either the node or the relationship.

This allows us to find other similar users, like in the example of Greta and Alice. We’ve queried the places they’ve mutually liked, and for each of those places, we can see the weights they have assigned.
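As a rough illustration of how two users’ weights on mutually liked places can be turned into a single similarity score, here is a small Python sketch. The text at this point does not name the metric, so cosine similarity is assumed here as one common choice. The ratings are made up, and restricting both vectors to the shared places is just one variant of the calculation.

```python
import math

# Hypothetical ratings (integers from 1 to 10); only the user names
# come from the example in the text.
RATINGS = {
    "Greta": {"Taco Stand": 8, "Smokehouse": 6, "Cantina": 3},
    "Alice": {"Taco Stand": 4, "Smokehouse": 3, "Burrito Bar": 9},
}


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity of two users' ratings over the places both rated.

    Returns 0.0 when the users have no rated places in common.
    """
    shared = set(ratings_a) & set(ratings_b)
    if not shared:
        return 0.0
    dot = sum(ratings_a[p] * ratings_b[p] for p in shared)
    norm_a = math.sqrt(sum(ratings_a[p] ** 2 for p in shared))
    norm_b = math.sqrt(sum(ratings_b[p] ** 2 for p in shared))
    return dot / (norm_a * norm_b)


similarity = cosine_similarity(RATINGS["Greta"], RATINGS["Alice"])
print(similarity)
```

With these made-up numbers, Greta’s ratings on the two shared places are exactly double Alice’s, so the two vectors point in the same direction and the score is at its maximum. Cosine similarity compares the pattern of ratings rather than their absolute level.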

 


