At the University of Washington, the IT team that serves the business side of the university tried more than one tool to give end users a way to find out about all the data at their disposal. Those tools failed to connect all the metadata and could not handle the ever-changing schema of the university’s data. To serve end users, the team built its own metadata tool using Neo4j.
In this week’s five-minute interview (conducted at GraphConnect New York), we discuss what inspired Pieter Visser and his team to build this tool, as well as the many uses they are finding for metadata.
Talk to us about how you guys use Neo4j at the University of Washington.
Pieter Visser: I work for the University of Washington’s IT department. We’re a unit that’s specific to the business side of the university, not the student side. There are a lot of people using Neo4j for research, but that’s not what we do.
We use Neo4j in a couple of different ways, but the main way we’re using it is as a metadata repository that stitches together information for the enterprise data warehouse and our BI tools.
With metadata, by definition, everything’s different yet everything’s connected. That’s what makes it interesting. It’s how a table is connected to a term, how a column is connected to a report or how it’s being used.
When we tried to do that in a relational database, connecting those relationships was almost impossible. You think you have it defined, and all of a sudden someone says, “Now we want to add a cube to this with this many dimensions, and I want to connect that dimension to a different term.” Relational databases can’t do that.
What made you choose Neo4j?
Visser: We tried a couple of different tools. We purchased a cloud-based tool, and we purchased some other tools, and none of them really worked right for us. I feel like it’s the Swiss Army knife kind of thing. You get a Swiss Army knife, but if you ever try to use it as a screwdriver, you just want a real screwdriver or a real hammer. We decided we wanted to make our own metadata tool.
Neo4j gave us the ability to basically connect any node to any other node and then show that visually. It gives us the context of our metadata.
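To make the idea of connecting any node to any other node concrete, here is a minimal Python sketch. It is an illustration only: it does not use Neo4j and is not the university's tool. The `MetadataGraph` class and every node and relationship name in it are hypothetical. It shows how a new kind of object, such as the cube dimension mentioned earlier, can be linked to an existing term without changing any schema.

```python
from collections import defaultdict


class MetadataGraph:
    """Toy in-memory graph (hypothetical): any node may link to any other node."""

    def __init__(self):
        self.labels = {}                # node id -> label, e.g. "Table"
        self.edges = defaultdict(set)   # node id -> {(relationship, other node id)}

    def add_node(self, node_id, label):
        self.labels[node_id] = label

    def connect(self, source, relationship, target):
        # Store the edge on both ends so context can be read from either node.
        self.edges[source].add((relationship, target))
        self.edges[target].add((relationship, source))

    def context(self, node_id):
        """Return (relationship, label, node id) for each neighbour, sorted."""
        return sorted(
            (rel, self.labels[other], other) for rel, other in self.edges[node_id]
        )


graph = MetadataGraph()
graph.add_node("employee_table", "Table")
graph.add_node("headcount_term", "Term")
graph.add_node("headcount_report", "Report")
graph.connect("employee_table", "DEFINES", "headcount_term")
graph.connect("headcount_report", "USES", "employee_table")

# A new kind of object (a cube dimension) needs no schema change:
# it is just another node with another relationship.
graph.add_node("department_dimension", "Dimension")
graph.connect("department_dimension", "DESCRIBED_BY", "headcount_term")

print(graph.context("headcount_term"))
```

Asking for the context of `headcount_term` returns both the table that defines it and the newly added dimension, which is the kind of surrounding context the answer above describes.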
What else made Neo4j stand out?
Visser: Our main goal was to make it easy for the end user. Metadata tools, by definition, are usually written for metadata managers. They’re really just not easy to use. While metadata managers love them, end users say, “I don’t understand this at all.” With Neo4j, end users can quickly visualize and get context for metadata. It is fantastic.
Can you talk to me about some of the most interesting or surprising results you’ve had while using Neo4j?
Visser: I think what is interesting now is, even outside of that project, to just look at that data, and see what we can do with the data that we’ve collected. We mix it with other data.
For example, we mix it with security information and create a semantic discretionary access control (DAC) layer. Or we mix it with organizational structure and do report recommendations based on org structure. So it’s a lot more than just the metadata as you start mixing in all kinds of data sources, and it tells a different story based on the same data.
If you could start over with Neo4j, taking everything you know now, what would you do differently?
Visser: I think I would change our data model. We added versioning right at the beginning. That complicated things significantly for us because, all of a sudden, you can no longer just traverse your nodes. You can’t simply say, “Go from this table to that table.” You have to say, “Go from this version of this table to that version of that table.” And that complicated all our Cypher queries tremendously.
That’s kind of the bread and butter of Cypher, right? Just to quickly say, “Show me the shortest path to that thing.”
If you have versions, you say, “From this active version to that active version.” And if I want to trace the lineage of something, I can no longer just traverse my trail. I have to traverse the trail and then figure out which version of this trail to use to go to the next one.
That’s not a Neo4j thing. It’s just that the data model that we designed complicated things and probably went against the best way to use Neo4j.
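The extra step that versioning adds to a lineage trace can be sketched in a few lines of Python. This is an assumption-laden illustration, not the team's data model or their Cypher: the object names, the `feeds` relationship, the version numbers and the "active" flag are all hypothetical.

```python
# Unversioned model (hypothetical data): lineage is a plain walk along "feeds" edges.
feeds = {"staging_table": "warehouse_table", "warehouse_table": "report"}


def lineage(start):
    """Follow the feeds edges from start until nothing is downstream."""
    path = [start]
    while path[-1] in feeds:
        path.append(feeds[path[-1]])
    return path


# Versioned model (hypothetical data): each object has several versions,
# exactly one of which is active, and edges connect specific versions.
versions = {
    "staging_table": {1: False, 2: True},
    "warehouse_table": {1: True},
    "report": {1: False, 2: False, 3: True},
}
versioned_feeds = {
    ("staging_table", 1): [("warehouse_table", 1)],
    ("staging_table", 2): [("warehouse_table", 1)],
    ("warehouse_table", 1): [("report", 2), ("report", 3)],
}


def active_version(name):
    """Return the version number flagged as active for this object."""
    return next(v for v, is_active in versions[name].items() if is_active)


def versioned_lineage(start):
    """Walk the lineage, choosing the active version at every hop."""
    current = (start, active_version(start))
    path = [current]
    while current in versioned_feeds:
        # Extra step: of all downstream versions, keep only the active ones.
        candidates = [
            (name, v) for name, v in versioned_feeds[current] if versions[name][v]
        ]
        if not candidates:
            break
        current = candidates[0]
        path.append(current)
    return path


print(lineage("staging_table"))
print(versioned_lineage("staging_table"))
```

Both functions reach the same report, but the versioned walk has to resolve which version to follow at every hop, which is the added complexity described above.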
What do you think the future of graph technology looks like for metadata?
Visser: It’s about the UI. I know we saw some tools for UI this morning, but I don’t really think it’s enough yet. I feel there will have to be much better UI tools that allow the end user to do an analysis.
Today we saw a textual Cypher query, something that converts text to Cypher. The user still has to understand the context, and they have to understand what they’re asking if they use it that way.
I would like to see a much more visual way to query. And I don’t mean by graph, going from this node to that node. Imagine you had Cypher snippets on the left and you say, “Well, I’m going to drag these snippets and connect them and get a brand-new result from that.” And then you say, “This snippet is everything I want to exclude. And this one is what I want to use to boost my results.”
Those are the kind of tools that will be much more user-friendly for my end users.
Want to share about your Neo4j project in a future 5-Minute Interview? Drop us a line at firstname.lastname@example.org
Want to learn more on how relational databases compare to their graph counterparts? Get The Definitive Guide to Graph Databases for the RDBMS Developer, and discover when and how to use graphs in conjunction with your relational database.
Get the Ebook