Fraud detection in retail with graph analysis

Fraud detection is all about connecting the dots. We are going to see how to use graph analysis to identify stolen credit cards and fake identities. For this article we worked with Ralf Becher of irregular.bi. Ralf is a Qlik Luminary and provides solutions that integrate the graph approach into Business Intelligence tools such as QlikView and Qlik Sense.

Third-party fraud occurs when a criminal uses someone else’s identity to commit fraud. For a typical retail operation, this takes the form of individuals or groups of individuals using stolen credit cards to purchase high-value items.

Fighting it is a challenge. In particular, it requires the ability to detect potential fraud cases in large datasets and the ability to distinguish real cases from false positives (cases that look suspicious but are legitimate).

Traditional fraud detection systems focus on thresholds related to customer activity. Suspicious activities include, for example, multiple purchases of the same product or a high number of transactions per person or per credit card.
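To make the threshold approach concrete, here is a minimal Python sketch. The transactions, the field layout and both threshold values are invented for illustration; they do not come from the dataset used in this article.

```python
from collections import Counter

# Hypothetical transactions: (person, credit card, product).
transactions = [
    ("alice", "card-1", "laptop"),
    ("alice", "card-1", "phone"),
    ("bob", "card-2", "laptop"),
    ("bob", "card-2", "laptop"),
    ("bob", "card-2", "laptop"),
    ("bob", "card-2", "laptop"),
    ("carol", "card-3", "headphones"),
]

# Assumed thresholds; a real system would tune these on historical data.
MAX_TRANSACTIONS_PER_CARD = 3
MAX_SAME_PRODUCT_PER_PERSON = 2


def flag_by_threshold(rows):
    """Return cards and (person, product) pairs that exceed the thresholds."""
    per_card = Counter(card for _, card, _ in rows)
    per_person_product = Counter((person, product) for person, _, product in rows)
    busy_cards = sorted(
        card for card, n in per_card.items() if n > MAX_TRANSACTIONS_PER_CARD
    )
    repeat_buyers = sorted(
        pair for pair, n in per_person_product.items()
        if n > MAX_SAME_PRODUCT_PER_PERSON
    )
    return busy_cards, repeat_buyers


busy_cards, repeat_buyers = flag_by_threshold(transactions)
print(busy_cards)
print(repeat_buyers)
```

Rules like these look at each person or card in isolation, which is why they can miss fraud that only becomes visible through connections between accounts.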

Graph analysis can add an extra layer of security by focusing on the relationships between fraudsters or fraud cases. It helps identify fraud cases that would otherwise go undetected until it is too late. We recently explained how to use graph analysis to identify stolen credit cards.

For this article, we have prepared a dummy dataset typical of an online retail operation. It includes:

To analyse the connections in our data, we stored it in Neo4j, a leading graph database. The graph approach consists of modelling data as nodes and edges. Here is a schema of our data represented as a graph:

Now that the data is stored in Neo4j, we can analyse it.
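To picture what modelling data as nodes and edges means, here is a small Python sketch. It is not the schema used in this article: the labels (`Person`, `Email`, `CreditCard`, `Order`) and the relationship names are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    label: str  # kind of entity, e.g. "Person" (assumed label)
    key: str    # identifying value


@dataclass(frozen=True)
class Edge:
    source: Node
    rel: str    # relationship name (assumed)
    target: Node


def neighbours(edges, node):
    """Return the nodes directly connected to `node`, in either direction."""
    found = []
    for edge in edges:
        if edge.source == node:
            found.append(edge.target)
        elif edge.target == node:
            found.append(edge.source)
    return found


alice = Node("Person", "alice")
email = Node("Email", "alice@example.com")
card = Node("CreditCard", "card-1")
order = Node("Order", "order-1001")

edges = [
    Edge(alice, "HAS_EMAIL", email),
    Edge(alice, "HAS_CREDIT_CARD", card),
    Edge(alice, "PLACED", order),
    Edge(order, "PAID_WITH", card),
]

# Everything one hop away from the credit card.
card_neighbour_labels = [n.label for n in neighbours(edges, card)]
print(card_neighbour_labels)
```

A graph database stores and indexes these connections natively, so hopping from a card to its owner and on to the owner's other identifiers stays cheap as the data grows.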

First of all we need to set a benchmark for what’s normal. Here is an example of a transaction:

Now that we have an idea of what not to look for, we can start thinking about patterns specifically associated with fraud. One such pattern is a piece of personal information (IP, email, credit card, address) associated with multiple persons.
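The pattern can be sketched in plain Python before turning to a graph database. The links below are invented sample data, and `min_persons` is a hypothetical parameter; the idea is simply to group persons by identifier and keep the identifiers that more than one person shares.

```python
from collections import defaultdict

# Hypothetical (person, identifier type, identifier value) links.
links = [
    ("alice", "email", "alice@example.com"),
    ("alice", "ip", "10.0.0.1"),
    ("bob", "email", "bob@example.com"),
    ("bob", "ip", "10.0.0.7"),
    ("bob", "credit_card", "card-9"),
    ("carol", "ip", "10.0.0.7"),
    ("carol", "credit_card", "card-9"),
    ("dave", "ip", "10.0.0.7"),
]


def shared_identifiers(rows, min_persons=2):
    """Map each identifier used by at least `min_persons` people to those people."""
    persons_by_identifier = defaultdict(set)
    for person, kind, value in rows:
        persons_by_identifier[(kind, value)].add(person)
    return {
        identifier: sorted(persons)
        for identifier, persons in persons_by_identifier.items()
        if len(persons) >= min_persons
    }


suspicious = shared_identifiers(links)
for (kind, value), persons in sorted(suspicious.items()):
    print(kind, value, persons)
```

A shared identifier is a lead rather than proof: members of one household may legitimately share an address or an IP, so matches like these still need review to separate real cases from false positives.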

Neo4j includes a graph query language called Cypher that allows us to detect such a pattern.
