Graphie Graphs4Good in Medical Research: 5-Minute Interview with HealthECCO

Let’s hear it for HealthECCO and their groundbreaking medical research using Graphs4Good, which won the team a 2021 Graphie Award! We got the opportunity to speak with Dr. Alexander Jarasch, Head of Data and Knowledge Management at the German Center for Diabetes Research, and Martin Preusse, Co-Founder of HealthECCO, about their award-winning CovidGraph initiative, which harnesses the power of graph technology to aid research in combating the seemingly endless pandemic.

These two powerhouses shared how their respective backgrounds have empowered their work in this space and revealed insights gleaned from their mastery of Neo4j over the years. They also shared some exciting projections for the future of graph technology.

Hear it in their own words below, and congratulations to these brilliant minds working to make the world a better place.

Please introduce yourselves.

Alexander Jarasch: My name is Alexander Jarasch. I’m the Head of Data and Knowledge Management at the German Center for Diabetes Research. And together with Martin [Preusse] and other colleagues, we accepted the Kaggle Challenge and founded the CovidGraph initiative. And in the German Center for Diabetes Research, as well as in CovidGraph, we’re using graph technology to tackle diseases and the complex network of diseases. My background is in bioinformatics with a PhD in biochemistry, and my main topic is managing data. And by managing data, also bringing people together.

Martin Preusse: My name is Martin Preusse. I have a background in biochemistry, so I didn’t start my career in computer science. While doing my PhD in computational biology, I started using Neo4j. And I’ve been using Neo4j professionally ever since. I do consulting work in the pharmaceutical industry for medical research. And what I enjoy most is starting new projects. I’m usually involved in kicking off new projects and starting them from the very beginning. Workshops, trainings, and so on. That’s also what I do.

Tell me about how you’re using Neo4j at HealthECCO.

Alexander Jarasch: We are connecting COVID-19-related data, from patents, from publications, as well as from the molecular world, to bring it together in one data source and make it available for researchers to find a cure for COVID-19.

What made you choose Neo4j?

Martin Preusse: I think we didn’t choose Neo4j. I think Neo4j is what brought the team together. So all of us have been using Neo4j for a long time. And there will be so many different reasons why the members of our team have been using Neo4j. So I don’t know. I started using Neo4j in 2013 or ’14, I think, because it’s just a very good tool for data integration. This is how it started. And there will be so many different reasons, and everyone will have a different perspective. But the point is, Neo4j and the graph, the integration power of graph databases – this is what brought the team together.

What have been some surprising results you’ve seen?

Alexander Jarasch: Yeah, that’s a good question. I think one of the basic answers to that is it’s so easy to integrate data and to connect data, let’s say, intuitively, like we humans would draw it on a whiteboard. And before, we hadn’t been able to do that in this easy way, in this timely manner. So I think this is one of the key results.

The second result is that we use the Graph Data Science Library to easily identify the most prominent gene that is relevant in COVID-19 or the SARS-CoV-2 virus. And we could easily calculate it within seconds using the GDS library.
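The interview does not say which algorithm the team ran to find that gene. As a rough illustration of the kind of centrality calculation a graph data science library can perform, here is a minimal pure-Python PageRank sketch on a toy graph. It does not use the GDS library itself, and the node names and edges are invented for illustration; they are not data from the team's knowledge graph.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        # Every node passes its rank on in equal shares along its outgoing edges.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical toy graph: made-up labels, not real biomedical data.
toy_edges = [
    ("PROTEIN_1", "GENE_A"),
    ("PROTEIN_2", "GENE_A"),
    ("PROTEIN_3", "GENE_A"),
    ("PROTEIN_3", "GENE_B"),
    ("PAPER_1", "GENE_A"),
    ("PAPER_1", "GENE_B"),
]

scores = pagerank(toy_edges)
top_node = max(scores, key=scores.get)
print(top_node, round(scores[top_node], 3))
```

In this sketch the node that receives the most incoming connections ends up with the highest score, which is the intuition behind ranking a "most prominent" entity in a connected dataset.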

What are you able to do now that you weren’t able to do previously?

Martin Preusse: I think there’s a huge aspect of community and communication. There’s a lot of technical things, for sure, that we can do now, but I think a lot of very positive outcomes are about communication and community. Because we now have this huge knowledge graph. We brought so many people together from the biomedical field, from medicine, with different backgrounds in this area. And I think this is something that was enabled by this technology. So this is also something that we hadn’t been able to do before.

What is your favorite part about using Neo4j?

Alexander Jarasch: For me, that would be bringing data together intuitively. And by bringing data together, we immediately bring people together. So we connect them, and they speed up their processes, basically.

What do you think the future holds for graph technology?

Martin Preusse: Very good question. First of all, I think and hope the future is bright. I think everything that is happening in the graph data science area is really, really interesting. And here, I want to see graph-native learning at some point, right? So if this happens, we can really use the full knowledge we have in our graphs, in our growing graphs, in machine learning and data analysis. So that will be a massive step forward, I think. So that’s going to be an exciting future.

How do you think graph technology can continue to serve medical research?

Alexander Jarasch: I think it will serve as the basic technology, as everything is connected by definition. And we have a complex network of diseases. We have a complex dependency. So that’s natively covered by a graph database. And when we compare it to the times of my PhD thesis [3D Modeling of Ribosomal RNA Using Cryo-electron Microscopy Density Maps, 2011], we didn’t have the capacity, and also the scalability wasn’t there. And now we have that. So we can now, one-to-one, map that in a graph database and do things with it.

How did you feel when you found out you were being honored?

Martin Preusse: Very excited, of course. So again, HealthECCO is about technology, about graphs, but it’s also about community and communication. So the award will hopefully lead to more exposure, which is very important, I think, because we try to bring people together and build a place where everyone working with graphs, with knowledge graphs, with Neo4j in the medical field can come together. And I think the Graphie Award will be tremendously helpful for that. And that’s why I was super excited.

Alexander Jarasch: And we also have a very big appreciation for all the people in the background who are not sitting here in the interview, who did a lot of work and spent a lot of time to support others, and also to connect to other people. So I think the appreciation to everybody is one of the coolest things. Yeah.

What are the first three words that come to mind when you think about Neo4j’s impact on medical research?

Martin Preusse: Things that come to mind about Neo4j’s impact on medical research. So again, this is a somewhat personal perspective, and I iterate on the same theme again and again. For me, the impact is actually not so much in technology, in query performance and data integration. For me, again, the point is communication.

So from experience, when you integrate data into Neo4j, you bring people together. And Alex also mentioned that a couple of times, but this is what actually happens in real-world projects. Yes, query performance is fantastic. It’s better and it scales well. But the actual power, in my opinion, is that it brings people together because we can talk about data with less abstraction. It’s tangible. It’s visual. And just having a tool that brings people with an actual medical background and people with a biology background together, this is a real-world and huge impact, I think.

Alexander Jarasch: I couldn’t agree more. I would add one more thing here that we don’t have with relational databases, or not so much. I think the database part and the visualization part are so close together with graph technology, in a way that’s not there in the SQL world. I think that is one of the biggest impacts that graph technology will bring to the medical part.

What is one main takeaway you have from introducing graph technology to your company?

Alexander Jarasch: So for me, it would be to bring that into my actual organization in the diabetes field – we connect more and more things on the way, very intuitively, very, let’s say, easy, especially when we talk about unstructured and very heterogeneous data. I think this is the main takeaway for me.

Martin Preusse: One takeaway I would add is it helps people to realize that things can go fast. This is, I think, another thing that is generally, in real world projects, very nice and cool about Neo4j, the whole development process. It’s just faster. You can iterate more quickly. You can be agile in a positive way. And a lot of people who would never have touched Neo data projects, see that we can go faster. We are agile. We can produce results quickly. It’s not a five-year project. It’s something we can just do and try. So this is something I see a lot.

What is one graph trend you are following in 2022?

Alexander Jarasch: I will definitely follow more of the knowledge graph part. I think knowledge graphs will be more and more important as the data becomes more unstructured and more heterogeneous and bigger. And if we add to that – that we have to connect internal with external data – I think it makes a lot of sense to build a huge knowledge graph with each and every organization. And that will lead, in my opinion, to better and quicker information retrieval.

Martin Preusse: On top of that, I will definitely follow graph-native learning and all the developments around graph neural networks and their real-world applications. When are they actually useful and better than other methods?

What other graph-based projects and initiatives do you think deserve some accolades? Get a head start on the 2022 Graphie Awards and get your nominations in today.
