Predicting Fraud: 5-Minute Interview with Marius Hartmann


“There’s a risk that your fraud detection becomes anecdotal because you become accustomed to certain addresses or names. You risk emphasizing the small-time fraudsters instead of the clever ones. The clever ones are the ones that don’t usually get caught,” said Marius Hartmann, Chief Advisor, Danish Business Authority.

The best time to stop a fraudulent business is at the very beginning, when the business registers with the government. That’s the role of the Danish Business Authority, where the team uses graph data science to predict fraud and stores detailed data about every decision in a graph database for later review.

In this week’s five-minute interview (conducted at GraphTour NYC 2019), we speak with Marius Hartmann, Chief Advisor, Danish Business Authority, about how they use graph data science to fight fraud.



How do you use Neo4j at the Danish Business Authority?


I lead a small unit of data scientists, and we are working with machine learning in relation to graph technology to fight fraud. So we use it for fraud detection, but as an outcome of that task, we’re also using it for personalization in order to provide a better service to our clients.

How were you solving this problem before?


Well, I learned some hard lessons during some projects concerning telemetry off of some of our formulas and things like that. We have some great metadata from that. We also learned the hard lesson of joining tables when you get beyond a certain degree. So I think the reason that we are using graph technology now is that some of the business goals have actually changed from the initial goals of digitization when you wanted to add data more easily or you wanted to sum up things and get the average. Now you’re more interested in the individual data point and its immediate surroundings.

What is it about graphs specifically that has helped you?


Some of the questions that we needed to answer included whether a company that’s formed is liable to commit fraud. Is this registration being done with the intent to commit fraud? And that’s a really hard question to answer because you need to predict a lot of things.

So we needed to provide a context for our models from which to make those evaluations.



How has the project expanded since you first started?


It has affected a lot of our infrastructure. Of course we’ve introduced graph technology, but also some of the sidekicks to machine learning and the task at hand, such as an event-driven architecture. That architecture has also become part of our metadata strategy of providing explanation and traceability for our machine learning models, in that we simply keep the events in a graph as well. So we get a full historical view of all the decisions we’ve ever made and all the data transformations, and we store it in Neo4j.
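To make the idea of keeping events in a graph for traceability concrete, here is a minimal Python sketch. It is not the Danish Business Authority's implementation: the event names, the `derived_from` field, and the `trace` helper are all hypothetical, and a plain dictionary stands in for the graph database. It only illustrates how linking each event to the events it was derived from lets you walk back from a decision to everything that fed into it.

```python
# Hypothetical event log (all names invented for illustration).
# Each event lists the earlier events it was derived from, which makes
# the log a directed graph of decisions and data transformations.
EVENTS = {
    "registration-received": {"kind": "data", "derived_from": []},
    "address-normalized": {"kind": "transformation", "derived_from": ["registration-received"]},
    "risk-score-computed": {"kind": "model", "derived_from": ["address-normalized"]},
    "case-flagged-for-review": {"kind": "decision", "derived_from": ["risk-score-computed"]},
}


def trace(events, event_id):
    """Return event_id and every upstream event, with each event listed
    after the events it was derived from."""
    seen = set()
    ordered = []

    def visit(current):
        if current in seen:
            return
        seen.add(current)
        # Visit upstream events first so they appear earlier in the result.
        for parent in events[current]["derived_from"]:
            visit(parent)
        ordered.append(current)

    visit(event_id)
    return ordered


if __name__ == "__main__":
    for step in trace(EVENTS, "case-flagged-for-review"):
        print(step, "-", EVENTS[step]["kind"])
```

In a real graph database the same walk would be a traversal over relationships rather than a recursive function over a dictionary, but the shape of the question (what led to this decision?) is the same.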

What surprising results have you seen from using Neo4j?


I think the interesting results are really some of the questions that we are now able to formulate. And that also sheds some light on some of the shortcomings of the usual methods of detecting fraud. There’s a risk that your fraud detection becomes anecdotal because you become accustomed to certain addresses or names.

And the problem with that is that you risk emphasizing the small-time fraudsters instead of the clever ones. The clever ones are the ones that don’t usually get caught.

When you have them in a graph, suddenly they’re connected.

Where do you see the project going from here?


We’ll be looking immediately into explainability. We did a proof of concept of the explainability pattern in 2018. And before we launch the next major anti-fraud model, I wish to make damn sure that we’re able to explain why we dissolved a company or advised a human being to consider dissolving that company. Of course we’re not doing automatic decision-making. It’s still a human in the loop situation.

Any advice for those who are just getting started with graphs?

I think the major hurdle in introducing graphs is explaining them to management. So, keep it far away from the technical side.

Take a small part of your data and just build the graph and show it to them. Show that you can connect any point in your data. And that will convince management that you can actually deliver value. So shortest path; it’s very easy to understand.
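As an illustration of the shortest-path demo suggested above, here is a minimal Python sketch using breadth-first search on a tiny in-memory graph. The companies, people, and addresses are invented placeholders, and the `build_graph` and `shortest_path` helpers are assumptions made for this example, not anything described in the interview. A graph database would typically offer shortest-path queries directly; this only shows the idea on a handful of records.

```python
from collections import deque

# Hypothetical toy data: every name here is invented for illustration.
# Each pair is an undirected link, e.g. a person tied to a company or address.
EDGES = [
    ("Company A", "Person X"),
    ("Person X", "Address 1"),
    ("Address 1", "Company B"),
    ("Company B", "Person Y"),
    ("Company C", "Person Z"),
]


def build_graph(edges):
    """Build an undirected adjacency map from a list of node pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def shortest_path(graph, start, goal):
    """Breadth-first search: return one shortest path as a list of nodes,
    or None if start and goal are not connected."""
    if start not in graph or goal not in graph:
        return None
    previous = {start: None}  # maps each visited node to the node it was reached from
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the 'previous' links back to the start, then reverse.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in sorted(graph[node]):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


graph = build_graph(EDGES)

if __name__ == "__main__":
    print(shortest_path(graph, "Company A", "Company B"))
    print(shortest_path(graph, "Company A", "Company C"))
```

With this toy data, the two companies that look unrelated in a table turn out to be linked through a shared person and address, while the third company has no path to either. That is the kind of result a non-technical audience can follow at a glance.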

Want to share about your Neo4j project in a future 5-Minute Interview? Drop us a line at content@neo4j.com