Not Everyone Is a Data Scientist: 5-Minute Interview with Sony Green


“Graph is a great starting point because it really is much closer to the way humans think about things than tabular data,” said Sony Green, director of business development and cofounder of Kineviz.

Kineviz’s GraphXR is a browser-based visualization tool that brings speed, power and flexibility to the exploration of data.

In this week’s five-minute interview (conducted at GraphTour SF 2019), we discuss how Sony Green and his team use Neo4j with GraphXR.



How does your product use Neo4j?


Sony Green: GraphXR is a visual analytics tool. A lot of times when people think about data visualization, they think of an output: You’ve done your analysis and here is what you show at the end. For us, graph or data visualization is a way of exploring and analyzing the data and learning things that would not necessarily be accessible just by looking at the numbers themselves.

It was really a natural partnership for us with Neo4j, given that Neo4j has defined the graph space. We were actually already using graph for visualization. So it was a perfect union to be able to take advantage of Neo4j’s technology. The labeled property graph is a really powerful model.

A lot of times I’ll talk to someone about Neo4j or graph databases and people think of it as sort of a specialized tool or niche tool, but it really has such a wide range of possibilities and addresses so many needs. In our view it is kind of what a relational database should be, actually delivering on the promise of the relational database.

What Neo4j enables us to do in GraphXR is to treat data in a different way. So it’s both the structure of the graph but then the underlying statistical data that you can work with in the same context.

What made you choose Neo4j?


Green: What Neo4j offers is an incredible level of flexibility and power, and GraphXR was built to really capitalize on that.

You encounter two different approaches to graph analysis. There are people who are starting with a really big dataset and trying to home in on a particular topic of interest, and people who are starting out with one node and trying to trace connections to something else or find a pattern that it’s a part of.

Bringing together disparate datasets is an important part of those analyses. That’s something that we have really tried to make as easy as possible within GraphXR. You can bring a flat CSV in and link it up to an existing Neo4j dataset or another CSV and do dynamic modeling within the visualization tool, which is something we haven’t seen a whole lot of other people do.
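To make the idea of linking a flat CSV to existing graph data concrete, here is a minimal Python sketch. It is not GraphXR's or Neo4j's API; the node data, the CSV columns, and the function name `link_csv_to_nodes` are all hypothetical, chosen only to illustrate joining rows to nodes on a shared key.

```python
import csv
import io

# Hypothetical nodes already in a graph, keyed by id
# (a stand-in for data that would live in a graph database).
people = {
    "p1": {"label": "Person", "name": "Ada"},
    "p2": {"label": "Person", "name": "Grace"},
}

# Hypothetical flat CSV: one row per order, where person_id
# is expected to match an id in the existing graph.
ORDERS_CSV = """order_id,person_id,amount
o1,p1,25.00
o2,p1,40.50
o3,p2,12.00
o4,p9,99.99
"""


def link_csv_to_nodes(csv_text, nodes, key_column, id_column, label, rel_type):
    """Turn CSV rows into new nodes and connect each to an existing node.

    Rows whose key has no matching node are reported, not silently dropped.
    """
    new_nodes = {}
    edges = []
    unmatched = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        source_id = row[key_column]
        if source_id not in nodes:
            unmatched.append(row[id_column])
            continue
        new_nodes[row[id_column]] = {"label": label, **row}
        edges.append((source_id, rel_type, row[id_column]))
    return new_nodes, edges, unmatched


order_nodes, edges, unmatched = link_csv_to_nodes(
    ORDERS_CSV, people, "person_id", "order_id", "Order", "PLACED"
)
```

Each matched row becomes an `Order` node with a `PLACED` edge from the person it belongs to, and the row pointing at the unknown id `p9` is returned separately so it can be reviewed.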

Where are you seeing interest in this type of graph visualization and exploration?


Green: We work with companies in a lot of different spaces: law enforcement, healthcare, business intelligence use cases. We are able to support this range of use cases because graph is such a universal way of modeling data.

One example is NIH. They are launching their bioinformatics portal through NIAID, the National Institute of Allergy and Infectious Diseases.

The portal is built on GraphXR. It allows a researcher, starting out with whatever topic they are focused on, to be able to find relevant documents and explore connections that might not be obvious. You might have an author who has written on your topic and another topic and there’s actually some underlying science connecting the two that you may not have been aware of. The bioinformatics portal allows you to identify those connections.

Part of the reason that NIH came to us in the first place is that they are using Neo4j as the underlying platform for this dataset. They’re exploring different uses of it beyond simply the knowledge management aspect, such as being able to trace the spread of an epidemic across the map or seeing the time series activity.

Is there another area you’d like to highlight?


Green: One area where we’ve seen a lot of interest is in the broad law enforcement and cybersecurity arena. There’s a whole discipline, OSINT, open-source intelligence, where they’re using social media and other publicly available data in order to try and identify terrorist threats, human trafficking, things of that nature. And for that, graph is a perfect metaphor.

We’re working with a school in the Netherlands, the International Anticrime Academy. They are the first official GraphXR trainer in the world.

The Anticrime Academy works with law enforcement. They also work with private institutions, banks, insurance companies, looking for fraud rings – and it’s really interesting to see sort of the range of data sources that can inform that kind of analysis.

If you look at payment behavior you might see multiple people using the same credit card, for example. But it might not be something so obvious; it might be that there are a lot of orders for the same thing coming from a single IP. It could be that someone is masking their IP. We’ve seen examples where people are acting from all over the globe because they’re masking their location. Maybe there’s nothing going on there; maybe there is.
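The two signals mentioned here, a card shared by several people and many orders for the same item from one IP, can be sketched in a few lines of Python. This is an illustrative sketch on made-up records, not the Anticrime Academy's or GraphXR's method; the field names, function names, and the threshold of three orders are assumptions. As the interview notes, a hit is a lead to investigate, not proof of fraud.

```python
from collections import defaultdict

# Hypothetical order records; all values are invented for illustration.
orders = [
    {"customer": "alice", "card": "card-1", "ip": "10.0.0.1", "item": "phone"},
    {"customer": "bob", "card": "card-1", "ip": "10.0.0.2", "item": "laptop"},
    {"customer": "carol", "card": "card-2", "ip": "10.0.0.3", "item": "phone"},
    {"customer": "dave", "card": "card-3", "ip": "10.0.0.3", "item": "phone"},
    {"customer": "erin", "card": "card-4", "ip": "10.0.0.3", "item": "phone"},
    {"customer": "frank", "card": "card-5", "ip": "10.0.0.4", "item": "tablet"},
]


def shared_cards(records):
    """Return cards used by more than one distinct customer."""
    customers_by_card = defaultdict(set)
    for record in records:
        customers_by_card[record["card"]].add(record["customer"])
    return {
        card: sorted(customers)
        for card, customers in customers_by_card.items()
        if len(customers) > 1
    }


def repeated_item_ips(records, min_orders=3):
    """Return (ip, item) pairs with at least min_orders orders."""
    counts = defaultdict(int)
    for record in records:
        counts[(record["ip"], record["item"])] += 1
    return {pair: n for pair, n in counts.items() if n >= min_orders}


suspicious_cards = shared_cards(orders)
suspicious_ips = repeated_item_ips(orders)
```

In graph terms, each result is a hub: a card or IP node with several customers or orders attached, which is the kind of pattern that stands out visually when the same data is drawn as a graph.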

The Neo4j Twitter Troll Sandbox was one of the first datasets that we used as an example for GraphXR. It’s a great point of reference in terms of a really complicated misinformation campaign and being able to see the strategies being used there.

One of the things that’s particularly interesting about what the Russian Internet Research Agency (IRA) did was that they had misinformation coming from multiple angles. You had trolls who were posting very right-wing messages, but then other user personas that were coming from a left-leaning position, with a very inflammatory set of messages really designed to create as much friction as possible, and unfortunately it was very effective.

What is your experience of partnering with Neo4j?


Green: One of the reasons that we love working with Neo4j is because their approach is so much bigger than just Neo4j. They are graph evangelists; they’ve created the space and they are helping everyone who is in the space. And we’d like to emulate that as much as possible from a graph visualization standpoint and show people what the potential of visualization is.

A lot of people look at visualization as kind of a nice-to-have or a thing you do once all the actual analysis is done. For us the graph is the engine of the analysis, so you can explore your data.

And a big challenge for analysis is knowing what questions you want to ask. You have a dataset and you don’t necessarily know what you’re looking for. A lot of the time people think of graph as sort of this static thing: you have your data structure, your schema, and now you just need to show it.

The fact of the matter is, depending on the question, you might need a very different structure to analyze it. Having dynamic modeling capability and being able to work with data in a really fluid way enables you to ask different questions without having to go back to the drawing board every time.

Is there anything else you’d like to share?


Green: One thing for me that’s really important is the notion of data literacy. I don’t come from a data science background. I went to school for sculpture and then worked in the video game industry. When I look at the world, you know, the business environment, education, so many different arenas, data is an integral part of decision-making, and not everyone is a data scientist; not everyone wants to be a data scientist.

And there needs to be a way for people to interact with data without having to go through that kind of training. And I think that graph is a great starting point because it really is much closer to the way humans think about things than tabular data.

One thing that’s kind of interesting about Kineviz is that we’re actually a spin-off from an arts nonprofit. Our CEO Weidong Yang started Kinetech Arts in 2014, which is devoted to the intersection of performance, dance, and technology, coming from a highly visual and highly emotional approach to looking at data.

So the notion of feeling data may sound kind of touchy-feely, and it is, but making data accessible and making insight rapidly available is a very real need that graph allows us to address.

Want to share about your Neo4j project in a future 5-Minute Interview? Drop us a line at content@neo4j.com

