As the Vice President of Data Science for a major IT firm, Dr. Kateryna Nesvit architected a data science platform that predicts the appropriate levels of talent compensation for more than 15,000 IT companies. For 11 years, she has taught university courses in applied and computational mathematics, including numerical analysis, data science, knowledge graphs, data analysis, and recommendation systems. She has authored 65 published papers and holds one scientific patent.
In her upcoming talk at Neo4j’s NODES 2022 virtual conference, entitled “Discover Invisible Patterns in Your Data,” Dr. Nesvit will demonstrate how folks can take the same critical steps she did in her data science career, by connecting Google Sheets to Neo4j graph databases for the first time. In an interview with Neo4j, she shares more about her background, and how graph databases served as a platform for her accomplishments.
Dr. Kateryna Nesvit: Data science is a very creative role, I would say. You have the freedom to create algorithms that will satisfy business needs. . . to be open-minded, be creative, and solve some problems that still exist and are still not solvable.
I’m a Ukrainian mathematician who grew up in an academic family. I got my Bachelor’s, Master’s, and Ph.D. degrees in numerical math and computing. When I was 18, I started to help scientists with research on influencing the size and shape of dental implants. Two years later, I joined my first conference in Europe in Innsbruck, Austria. The same year, my first article was published. Since then, I’ve just been really passionate about creating [algorithms] and using data to help people stay healthy and productive.
The first project I worked on was a social network. The director wanted to create a unique recommendation system. For me, who was not using social media, it was kind of challenging. But, from that perspective, it helped me to develop what should be inside that I would use.
The next project I joined was as a vice president of data science. They had data, but it was very sparse, very diverse. . . but the ambitions were just huge. They wanted to help thousands of companies, but we had this [tiny] piece of data. It was complicated, I would say. My team worked together as scientists to create algorithms. Everyone stayed creative, and I can say that we ended up with good achievements. We finally released the app. It’s now up and running. Moreover, this challenge concluded with a scientific patent on the algorithms we created.
Jennifer Reif: You have the heart of a practitioner with a passion for sharing tech with others. What led you to teach others about technology?
Dr. Kateryna Nesvit: I think this all comes from my parents. My parents are teachers, and I would say that they are the best teachers in the world. The way they teach, I understand every single detail of technology, of mathematics. My mother is a mathematician, my father is in technology, so I’m a combination of those. That helps me create a way to explain something so that other people will understand: how I can approach one person or another, at their level of knowledge, to get through the difficulties or solve the problem. And technologies always help us to solve problems. I think technology is a tool, and without it, we cannot do much.
When I was getting my master’s degree, there was an opportunity to get an additional specialization to become a teacher and lecturer. From that point, I started to think, “Yes, this is something where I want to share my knowledge. I want to share what experience I already have.” Since then, I’ve been teaching at different universities for eleven years. What I like most is when I explain something to students (and not only students, but also my colleagues) and their eyes light up. “Yes, now I can do it. Now I can do it!”
Graphs Enter the Picture
Jennifer Reif: When and how did you first encounter graph databases?
Dr. Kateryna Nesvit: It was in 2015 when I read a book on SQL. I saw a comparison with Neo4j where the query execution time was many times faster. Another chapter in the book said that an SQL query for one million users at the fifth depth of complexity would not finish at all. And I thought, “What? And this is a tool? This is a technology?” Then for four steps of complexity in SQL – not a million, one thousand – 92 seconds [to complete]. And Neo4j, less than 1 second. This is a huge difference.
At that point, I started to Google everything about Neo4j. I started to work, and it immediately triggered me to use that technology. So that was a very good starting point.
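To make the idea of query “depth” in that comparison concrete, here is a minimal Python sketch of a depth-limited friends-of-friends search. It is only an illustration under stated assumptions: the `friends` data and the `reachable_within` function are hypothetical, it is not the benchmark from the book, and it does not show how Neo4j or any SQL engine works internally. The point is simply that each extra level of depth is one more hop along stored connections, whereas a relational query typically needs another self-join per hop.

```python
from collections import deque


def reachable_within(graph, start, max_depth):
    """Return {person: hops} for everyone reachable from `start`
    in at most `max_depth` hops (breadth-first search)."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        person = queue.popleft()
        if hops[person] == max_depth:
            continue  # do not expand beyond the requested depth
        for friend in graph.get(person, ()):
            if friend not in hops:  # first visit is the shortest path
                hops[friend] = hops[person] + 1
                queue.append(friend)
    del hops[start]  # the starting person is not their own friend
    return hops


# Hypothetical toy network: a simple chain of five people.
friends = {
    "ana": ["bo"],
    "bo": ["ana", "cy"],
    "cy": ["bo", "di"],
    "di": ["cy", "eli"],
    "eli": ["di"],
}

print(reachable_within(friends, "ana", 3))
```

Raising `max_depth` from 3 to 4 only adds one more round of hops here; the work grows with the number of connections actually followed rather than with the size of a full table.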
Jennifer Reif: So nobody introduced you to graphs. You just kind of picked it up yourself and went from there?
Dr. Kateryna Nesvit: Yes. I think the book covered seven different databases — a general book. It was not anything specific. But when I saw the comparison of what’s in practice, then a graph database was definitely the choice that I wanted to make.
The Integration Challenge
Jennifer Reif: Was there a particular project or problem you encountered that motivated you to use graph?
Dr. Kateryna Nesvit: Yes. I had a project where I needed to build a recommendation system for users and how much they would like the content. It was similar to Instagram, but allowed you to give some kind of reward for the media content. The activity screen is unique in that it looks like circles, and the size of each circle represents how much you like that content. Since each user is unique and individual, the recommendation system also should be unique and individual. At that time, I was searching for a tool, and Neo4j came up through the book, and that kind of made the choice for me.
Jennifer Reif: So you weren’t using anything prior to that. You just had this problem, and you thought it might be a good fit for graph?
Dr. Kateryna Nesvit: Exactly. I can say mostly people start with tables and SQL. I started with Neo4j. Maybe I just read a lot about the others and what the problems were, so I just said “Ok. I will look around, but I will start here.”
Jennifer Reif: Was there something really difficult when you started incorporating graph into that project?
Dr. Kateryna Nesvit: At that time, it was 2015, and there were not too many things in the cloud. I remember when I was deploying manually from the command line. It was not like now where we have all the settings, and we just click on a button. So it was a little bit complicated for me. I’m not a computer scientist and don’t do much programming, so for me at the time, it was a challenge.
Jennifer Reif: Do you have some other current projects that you’re using with Neo4j, as well?
Dr. Kateryna Nesvit: Right now, I do have some projects. I do work for a medical company that is using some devices, and they predict how their people will be cured and where medications are. So this is also the type which I can clearly see will be a recommendation system. It will be the graph. The transition is always challenging, and this transition is happening right now.
Also, in teaching, I’m currently a visiting professor at Marymount University, and I teach statistics and data science where I use Neo4j as a graph database. If it is building an application, then we use it to simulate the data. If it is statistics, it looks very natural for students to see bubbles, to see what’s related, so they are a little bit more inspired. It’s not the whole course, but a little piece. Their eyes just light up, and wow!
Jennifer Reif: It’s fantastic you’re finding ways to incorporate graphs into things like statistics, which a lot of people wouldn’t associate with graphs. They might think that it’s just a math class, so why would they need to incorporate databases? You’re finding a way to use graphs as a learning tool to help your students understand mathematics better.
Dr. Kateryna Nesvit: Yeah, I just shift the priorities. Yes, this is a database, but how is it mapped to statistics? Statistics is basically some measurement. And I found measurements we can use in Neo4j. Basically, when we analyze in Neo4j Bloom [especially when we have a feature] to change size or color. In order to get to that final point, would you upload your data? Would you change the size? Can you change this range? This is statistics. That’s how we can shape something that we need to show.
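The link between a value range and node size can be sketched with a simple min-max scaling. This is a hedged illustration only: the `scale_to_size` function, its size bounds, and the `likes` values are hypothetical, and the code does not reproduce Neo4j Bloom’s own styling logic. It just shows the kind of statistic (minimum, maximum, range) that sits behind “change the size” and “change this range.”

```python
def scale_to_size(values, min_size=1.0, max_size=4.0):
    """Map each value linearly onto [min_size, max_size] (min-max scaling)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are equal: give every node the same mid-range size.
        return [(min_size + max_size) / 2 for _ in values]
    span = hi - lo
    return [min_size + (v - lo) / span * (max_size - min_size) for v in values]


# Hypothetical property values, e.g. how much each item was liked.
likes = [2, 10, 6]
sizes = scale_to_size(likes)
print(sizes)  # [1.0, 4.0, 2.5]
```

Changing `min_size` and `max_size` corresponds to changing the visual range, while the minimum and maximum of the data decide where each node falls inside it.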
Dr. Nesvit at NODES 2022
Jennifer Reif: Could you tell us a bit about the inspiration for your NODES presentation about connecting Google Sheets to Neo4j?
Dr. Kateryna Nesvit: It was very unexpected. I got a message from Michael Hunger, who invited me to give a talk. It was like fresh air to my daily routine this year. It’s been a very difficult year for me, and my initial tentative talk was about Neo4j graphs in education, research, industry and how to make an impactful decision. It was very ambitious, very global, and very wide. Next, Michael gave a recommendation to narrow down the topic. I decided to talk about probably the most painful things: tables and graphs.
Jennifer Reif: And you were using Google Sheets and connecting that to Neo4j? Or did this come about because of the topic narrowing down for NODES?
Dr. Kateryna Nesvit: The Google Sheets [angle] came because I think it’s most often used in our daily lives. So, the most complicated thing for people is converting their vision — their view of representing the tables in their minds — into the graph view. And sometimes people want to try, but they’re scared to do the first step. There are a lot of recommendations. There are thousands and thousands of tutorials and videos, each thing valuable. To make an actual step, it’s complicated. I found it’s not easy to take a breath and really step in and practice doing. So just practice and start to do it. Start to convert these tables to a graph and see what you didn’t see before.
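As a first practice step of that kind, the sketch below turns a small table into nodes and relationships using only the Python standard library. Everything in it is an assumption made for illustration: the sheet is taken to be exported as CSV text, the column names (`user`, `content`, `likes`) and the `LIKED` relationship type are hypothetical, and this is not the method shown in the NODES talk. Loading the result into Neo4j would be a separate step that is not shown here.

```python
import csv
import io

# Hypothetical sheet contents, as they might look after a CSV export.
SHEET_CSV = """user,content,likes
Ana,Sunset photo,5
Bo,Sunset photo,2
Ana,City video,1
"""


def table_to_graph(csv_text):
    """Convert table rows into a set of (label, name) nodes and a list of
    (start_node, relationship_type, end_node, properties) relationships."""
    nodes = set()
    relationships = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        user = ("User", row["user"])
        content = ("Content", row["content"])
        # A set keeps each user and each content item only once,
        # even when it appears in several rows of the table.
        nodes.update([user, content])
        relationships.append((user, "LIKED", content, {"likes": int(row["likes"])}))
    return nodes, relationships


nodes, relationships = table_to_graph(SHEET_CSV)
print(len(nodes), "nodes,", len(relationships), "relationships")
```

Three rows become four nodes and three relationships: the repeated values “Ana” and “Sunset photo” collapse into single nodes, which is where shared connections that are hard to spot in the table start to become visible.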
Jennifer Reif: You mentioned that people had the most trouble knowing where to take that first step with graphs. What do you hope attendees learn or discover through your presentation?
Dr. Kateryna Nesvit: What I hope is that at least 30 percent of my listeners will go out from their work and actually start to practice and convert the table data to a graph and make sense of data – to derive better decisions, to make things better, more efficient, more healthy. I want to show that it’s not difficult. I want to show that our daily Google Sheets can be connected in seconds, and in another few minutes, you will be able to see something you never saw before in the table. This is my goal for that talk. Just want to show, “This is easy. You can do it. Anyone can do it.”
NODES 2022 is a free, online, worldwide graph tech conference taking place November 16 and 17. The 24-hour agenda will be packed full of beginner-, intermediate-, and advanced-level content for technologists and graph-lovers. For those interested in attending Kateryna’s session (like myself!), register for the event at neo4j.com/nodes-2022 and catch her presentation on November 16, 21:20 GMT / 16:20 ET / 13:20 PT.