GraphGist: A Simple Meta-Data Model for a Graph Database

by perival

Setting up a Meta-Data Framework

This GraphGist is a quick exploration of a simple meta-data administration model that can be used to store the structure of the nodes and relationships in a graph database such as Neo4j.

The data that is set up here could be used in an application layer to provide tailored UI and validation, or could be used simply with Cypher queries as a kind of meta-data dictionary. One may argue that this is pushing too much structure into a graph database, but I think the concept is worth exploring. How well can you query a graph database if you’re not sure of the structure of the data?

This is a fairly quick stab at the idea, and it could certainly be taken much further to cover properties, security aspects and more. Mostly it was a learning exercise for me, and I hope it will generate some thoughtful criticism. Hopefully I haven’t made too many egregious mistakes!

The basic concept is that each node in the graph will have a NodeType, and that NodeType will itself be represented as an 'admin' node. Likewise, each relationship will have a RelType, and that RelType will also be represented as an 'admin' node. We can then identify, for each RelType, which types of nodes it can be used with at both the start and the end. Conceptually the 'admin' model looks like this:

To actually realize the concept, I came up with the following model of nodes and relationships. (Note that in the diagrams, the node name is shown first, with the node’s NodeType below in square brackets.)

Setting up new Meta-Data for an application

Once the admin infrastructure is in place, any new NodeTypes and RelTypes can be set up. In this example, assume we will have Person and Date nodes, and will need relationships that capture a Date hierarchy, marriage ties, and birthdates.

The necessary setup will involve the creation of two new AdminNodeTypes (Person and Date), three new AdminRelTypes (DateIn, Spouse, and Birthdate), and the relationships necessary to link them together:

  • The Node Type Owner Owns Person and Date

  • The Rel Type Owner Owns DateIn, Spouse, and Birthdate

  • For DateIn, the StartNodeType is Date, and the EndNodeType is Date

  • For Spouse, the StartNodeType is Person, and the EndNodeType is Person

  • For Birthdate, the StartNodeType is Person, and the EndNodeType is Date

The result looks like this:

// All admin data setup (one statement, so identifiers are shared across clauses)

// Owner nodes: entry points that own every NodeType and RelType definition
CREATE (ntOwner {name:'Node Type Owner', type:'Admin', descr:'Owns all Node Types'})
CREATE (rtOwner {name:'Rel Type Owner', type:'Admin', descr:'Owns all Rel Types'})

// NodeTypes used by the admin model itself
CREATE (admin   {name:'Admin', type:'AdminNodeType'})
CREATE (adminNT {name:'AdminNodeType', type:'AdminNodeType'})
CREATE (adminRT {name:'AdminRelType', type:'AdminNodeType'})

// RelTypes used by the admin model itself
CREATE (owns    {name:'Owns', type:'AdminRelType', descr:'Owns'})
CREATE (startNT {name:'StartNodeType', type:'AdminRelType'})
CREATE (endNT   {name:'EndNodeType', type:'AdminRelType'})

// Ownership and allowed start/end NodeTypes for the admin RelTypes
CREATE (rtOwner)-[:Owns]->(owns)
CREATE (rtOwner)-[:Owns]->(startNT)
CREATE (rtOwner)-[:Owns]->(endNT)
CREATE (ntOwner)-[:Owns]->(admin)
CREATE (ntOwner)-[:Owns]->(adminNT)
CREATE (ntOwner)-[:Owns]->(adminRT)
CREATE (owns)-[:StartNodeType]->(admin)
CREATE (owns)-[:EndNodeType]->(admin)
CREATE (owns)-[:EndNodeType]->(adminNT)
CREATE (owns)-[:EndNodeType]->(adminRT)
CREATE (startNT)-[:StartNodeType]->(adminRT)
CREATE (startNT)-[:EndNodeType]->(adminNT)
CREATE (endNT)-[:StartNodeType]->(adminRT)
CREATE (endNT)-[:EndNodeType]->(adminNT)

// Application NodeTypes
CREATE (person {name:'Person', type:'AdminNodeType' })
CREATE (date   {name:'Date', type:'AdminNodeType' })

CREATE (ntOwner)-[:Owns]->(person)
CREATE (ntOwner)-[:Owns]->(date)

// Application RelTypes
CREATE (spouse    {name:'Spouse', type:'AdminRelType' })
CREATE (dateIn    {name:'DateIn', type:'AdminRelType' })
CREATE (birthdate {name:'Birthdate', type:'AdminRelType' })

CREATE (rtOwner)-[:Owns]->(spouse)
CREATE (rtOwner)-[:Owns]->(dateIn)
CREATE (rtOwner)-[:Owns]->(birthdate)

// Allowed start/end NodeTypes for the application RelTypes
CREATE (spouse)-[:StartNodeType]->(person)
CREATE (spouse)-[:EndNodeType]->(person)
CREATE (dateIn)-[:StartNodeType]->(date)
CREATE (dateIn)-[:EndNodeType]->(date)
CREATE (birthdate)-[:StartNodeType]->(person)
CREATE (birthdate)-[:EndNodeType]->(date)

Now some sample queries using this data

Get all valid NodeTypes

MATCH (o {name:'Node Type Owner'})-[:Owns]->(n)
RETURN n.name AS NodeType
ORDER BY n.name

Get valid RelTypes for each start NodeType

MATCH (o {name:'Rel Type Owner'})-[:Owns]->(r)-[:StartNodeType]->(n)
RETURN n.name AS NodeType, collect(r.name) AS RelTypes
ORDER BY n.name

Get valid Start NodeTypes for each RelType

MATCH (o {name:'Rel Type Owner'})-[:Owns]->(r)-[:StartNodeType]->(n)
RETURN r.name AS RelType, collect(n.name) AS StartNodeTypes
ORDER BY r.name

Get valid End NodeTypes for each RelType

MATCH (o {name:'Rel Type Owner'})-[:Owns]->(r)-[:EndNodeType]->(n)
RETURN r.name AS RelType, collect(n.name) AS EndNodeTypes
ORDER BY r.name
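
The start and end lookups can also be folded into a single dictionary-style query. This is only a minimal sketch: it assumes the Rel Type Owner is found by its name property rather than by node id, and a Cypher version that accepts property maps in MATCH (Neo4j 2.0 or later).

```cypher
// One row per RelType with its allowed start and end NodeTypes.
// DISTINCT is needed because every start type is paired with every
// end type of the same RelType before the rows are aggregated.
MATCH (o {name:'Rel Type Owner'})-[:Owns]->(r),
      (r)-[:StartNodeType]->(s),
      (r)-[:EndNodeType]->(e)
RETURN r.name AS RelType,
       collect(DISTINCT s.name) AS StartNodeTypes,
       collect(DISTINCT e.name) AS EndNodeTypes
ORDER BY r.name
```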

I did not explicitly connect each node to its NodeType via a relationship; rather, it’s just an implicit tie using the 'type' property on the node. I’m not sure whether there would be any benefit to using a relationship.
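
Even without an explicit relationship, the implicit tie can be checked in Cypher by comparing each node's 'type' property with the name of an AdminNodeType node. The following is a minimal sketch, assuming Neo4j 2.0 or later (for OPTIONAL MATCH); the column aliases are illustrative and not part of the model.

```cypher
// Report nodes whose 'type' property does not name a registered AdminNodeType.
MATCH (n)
OPTIONAL MATCH (nt {type:'AdminNodeType'})
WHERE nt.name = n.type
WITH n, nt
WHERE nt IS NULL          // no matching NodeType definition was found
RETURN n.name AS Node, n.type AS UnregisteredType
ORDER BY n.name
```

Against the setup data alone this should return no rows, because the 'Admin', 'AdminNodeType' and 'AdminRelType' types are themselves registered as AdminNodeType nodes.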

Variations of these queries can be used in the validation of Nodes and particularly Relationships to ensure that they are playing by the rules! I’ve built a simple version of a generic UI (html/javascript) for nodes and relationships using PHP for all database access and validation.
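
As one possible variation, the query below lists every relationship in the graph whose type, start NodeType and end NodeType are not permitted by the meta-data. It is a sketch under stated assumptions rather than the validation used in the PHP layer: it assumes Neo4j 2.0 or later, that relationship type names match the 'name' of their AdminRelType node exactly, and that every node carries a 'type' property.

```cypher
// For each relationship, look for an AdminRelType of the same name that
// allows the start node's type and the end node's type.
MATCH (a)-[r]->(b)
OPTIONAL MATCH (st {type:'AdminNodeType'})<-[:StartNodeType]-(rt {type:'AdminRelType'})-[:EndNodeType]->(et {type:'AdminNodeType'})
WHERE rt.name = type(r)
  AND st.name = a.type
  AND et.name = b.type
WITH a, r, b, rt
WHERE rt IS NULL          // no definition permits this combination
RETURN a.name AS StartNode, type(r) AS RelType, b.name AS EndNode
```

Because the Owns, StartNodeType and EndNodeType relationships are described by the same meta-data, the admin graph should pass its own check and yield no rows; a Spouse relationship from a Person to a Date, for example, would be reported.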
