Hi all,

Today, for my lab project, I decided to model an in-graph index in Neo4j and query it with the Cypher Query Language.

The basic problem we are trying to solve here is ordering events in a timeline and asking for ranges of events ordered in time, without needing to load the whole timeline or letting an external index like Lucene do the sorting (which is very costly). A simple approach is a multilevel tree, where you attach the domain nodes to the leaves of the index tree and query by traversing that structure.
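To make the shape of such a tree concrete, here is a minimal in-memory sketch in plain Python. It is not Neo4j code: the `Node` class, the helper names and the sample events are all made up for illustration, and I am assuming a three-level year/month/day layout with NEXT links between consecutive day leaves.

```python
from datetime import date

class Node:
    """Hypothetical stand-in for a graph node with typed outgoing relationships."""
    def __init__(self, name):
        self.name = name
        self.children = {}  # relationship type (e.g. "2011", "01") -> child Node
        self.next = None    # NEXT relationship to the following day leaf
        self.values = []    # VALUE relationships, stored here as event names

def add_event(root, day, event):
    """Walk root -[year]-> -[month]-> -[day]-> leaf, creating nodes as needed."""
    node = root
    for key in (f"{day.year:04d}", f"{day.month:02d}", f"{day.day:02d}"):
        if key not in node.children:
            node.children[key] = Node(key)
        node = node.children[key]
    node.values.append(event)
    return node

def leaves_in_order(root):
    """Collect the day leaves sorted by their year/month/day path."""
    out = []
    def walk(node, depth):
        if depth == 3:  # year, month, day levels consumed: this is a leaf
            out.append(node)
            return
        for key in sorted(node.children):
            walk(node.children[key], depth + 1)
    walk(root, 0)
    return out

def link_leaves(root):
    """Chain the day leaves with NEXT pointers in chronological order."""
    leaves = leaves_in_order(root)
    for current, following in zip(leaves, leaves[1:]):
        current.next = following
    return leaves

# Sample data (assumed, not the dataset used in this post).
root = Node("Root")
add_event(root, date(2011, 1, 3), "EventC")
add_event(root, date(2011, 1, 1), "EventA")
add_event(root, date(2011, 1, 2), "EventB")
leaves = link_leaves(root)
```

Zero-padded keys keep the lexical order of the relationship types identical to the chronological order, which is what lets the leaves be chained without parsing dates again.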



Now, to ask for all Events between 2011-01-01 and 2011-01-03 you simply find the starting and ending path (in this case they share the upper part of the tree) for these levels in the index, and then collect the Events hanging off the Day-nodes ordered via the NEXT relationships, following the VALUE relationships, if they exist.


All five segments of this query structure can be expressed in a single Cypher query:

START root=node:node_auto_index(name = 'Root')
MATCH 
  commonPath=root-[:`2011`]->()-[:`01`]->commonRootEnd,
  startPath=commonRootEnd-[:`01`]->startLeaf,
  endPath=commonRootEnd-[:`03`]->endLeaf,
  valuePath=startLeaf-[:NEXT*0..]->middle-[:NEXT*0..]->endLeaf,
  values=middle-[:VALUE]->event
RETURN event.name
ORDER BY event.name ASC

This returns Event2 and Event3. This may seem surprising at first, since we asked for the middle events, but notice that the variable-length path [:NEXT*0..] includes length 0 and has no upper limit. Because startLeaf and endLeaf are bound by the previous path definitions, they act as the inclusive boundaries of the range.
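The same boundary behaviour can be mimicked outside the database. The following is a rough Python sketch, not Neo4j code: `Day`, `build_chain` and `events_between` are hypothetical names, the events are invented, and the start and end leaves are looked up in a dict instead of being found through the year/month/day paths.

```python
class Day:
    """Hypothetical day leaf: NEXT points to the following day, events hang off it."""
    def __init__(self, key, events):
        self.key = key
        self.events = events
        self.next = None

def build_chain(days):
    """Create Day leaves from (key, events) pairs and chain them with NEXT."""
    leaves = [Day(key, events) for key, events in days]
    for current, following in zip(leaves, leaves[1:]):
        current.next = following
    return {leaf.key: leaf for leaf in leaves}

def events_between(index, start_key, end_key):
    """Follow NEXT from the start leaf to the end leaf, both included."""
    node, end = index[start_key], index[end_key]
    found = []
    while node is not None:
        # Zero hops is allowed, so the start leaf itself is a valid "middle".
        found.extend(node.events)
        if node is end:
            return sorted(found)
        node = node.next
    return []  # the end leaf is not reachable from the start leaf: no match

# Sample data (assumed, not the dataset used in this post).
index = build_chain([
    ("2010-12-31", ["EventA"]),
    ("2011-01-01", ["EventB"]),
    ("2011-01-02", []),
    ("2011-01-03", ["EventC", "EventD"]),
    ("2011-01-04", ["EventE"]),
])
result = events_between(index, "2011-01-01", "2011-01-03")
print(result)  # ['EventB', 'EventC', 'EventD']
```

Days outside the two bound leaves never contribute, and a day without events (2011-01-02 here) simply adds nothing, much like a leaf with no VALUE relationship.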

Some more examples on this data structure are available as part of the Neo4j Manual in the Cypher Cookbook section.

Happy hacking!

/peter


 

15 Comments

pehrlich says:

This is interesting, I like it. I imagine if you wanted to query for the next 20 days, rather than from a date range, you could do this:

START root=node:node_auto_index(name = 'Root')
MATCH
 commonPath=root-[:`2011`]->()-[:`01`]->()-[:`01`]->,
 valuePath=startLeaf-[:NEXT*0..]->middle-[:NEXT*0..20]->endLeaf,
 values=middle-[:VALUE]->

Exactly, very nice use of the potential of Cypher IMHO :)

bww00 says:

Thanks!
I have a problem for a clinic setting, where there are multiple patients with multiple events per interval (say 1 – 15 minutes), and multiple state changes for an event. Would it be better to have a tree for each patient, or a common tree for all? What impact would the granularity of the time have, down to the minute or second?

Regards,
Bryan Webb

@bww00 depends a bit on the amount of data. How big is the total number of events going to be?

@bww00 I think you should just create a test case and try it out. It sounds doable with one big tree for all events, but I wouldn't do that without performance testing.

panisson says:

Hello Peter, thank you for this nice example.
We are trying to use this approach to represent a timeline of frames, and a graph on each frame. However, we are using the Neo4j createRelationshipTo method (http://api.neo4j.org/current/org/neo4j/graphdb/Node.html#createRelationshipTo(org.neo4j.graphdb.Node,%20org.neo4j.graphdb.RelationshipType)) to create the

Hi, I think the key here is to use DynamicRelationshipType.withName("MONTH_01"), which lets you create relationship types dynamically.

Thanks Peter, I was not aware of the availability of the DynamicRelationshipType class.

By the way, is there any performance advantage in using relationship types in place of relationship properties?

Simon Hallé says:

Hi Peter, thanks for the example, looks pretty good! Although I haven't done many tests with Lucene, I was wondering why you noted that using such an index would be "very costly" for the sorting? In my situation, I need to implement a filter that returns all Nodes within a date range. Nodes are tagged with a start and optional end date (node properties). I noticed that

bww00 says:

Peter,
what would be the fastest way to create a test graph?
I also need to add a physician node with a relation to the event and a function node with a relation to the event.
I would need to query the events by date/time. I will need to query events by physician and also by function.

I am having trouble getting started in creating the graph and relationships.

Unknown says:

Thanks Peter, for a nice explanation. I'm seriously looking at graph databases now for our reporting suite – I have tried to model the time series data as you suggested, but since it goes down to the seconds level I'm quite conscious about performance. I have devices sending the data so I can append to the timeline graph with seconds precision. These devices are tied to a client

Roman Rytov says:

This is nice, but only for cases where a node is placed in a timeline once and for all. What if the relationship is time-dependent? Reporting lines, periods of study, marriage, place of residence, etc. How should the relationship be modeled so that queries perform effectively? Create "since" and "until" attributes and run subfilters? What's the right way?

flip101 says:

@bww00 Can you share your data model? I would like to make a similar setup. I have the same question about where to introduce several "users" into this model. Have you gotten any test results back?

@Unknown I don't think you have to specify the frames to the smallest unit of time. You can have an hour as smallest frame and then attach all those events to this hour. But
