Understanding Neo4j Query Plan Caching

This article is based on the behavior of Neo4j 2.3.2. Query plan caching is governed by three parameters, as defined in the conf/neo4j.properties file, which are detailed here. The three parameters which govern whether a Cypher statement is planned/replanned are:… Read more →

How to check for time range overlap in Cypher

Neo4j 3.4 introduced temporal types into Cypher, so now we have dates, dateTimes, and their local versions, too, as well as durations. While we don’t have a type for time ranges, we can use two temporal instants as the start… Read more →

Limiting MATCH results per row

Since LIMIT applies to the total number of rows of the query, it can’t be used in cases when matching from multiple nodes where the limit must be per row. Take an example case using the Movies database. If you… Read more →

Large Delete Transaction Best Practices in Neo4j

In order to achieve the best performance, and avoid negative effects on the rest of the system, consider these best practices when processing large deletes. Start by identifying which situation you are in: Deleting the entire graph database, so you… Read more →

Working with streaks in Cypher

When using Cypher for data analysis, you might have a problem where you need to identify or filter based upon some kind of streak. For example, for a sports graph, you might want to know the maximum number of consecutive… Read more →

Access to the neo4j-shell in NEO4J CE 3.x

From Neo4j 3.0 access to neo4j-shell is no longer possible from the desktop-installers for Windows and OSX. To use neo4j-shell, you have to download the TAR/ZIP distribution from: http://neo4j.com/download/other-releases/ For importing files that contain semicolon separated cypher commands, you can… Read more →

A note on OPTIONAL MATCHes

An OPTIONAL MATCH matches patterns against your graph database, just like a MATCH does. The difference is that if no matches are found, OPTIONAL MATCH will use a null for missing parts of the pattern. OPTIONAL MATCH could be considered… Read more →

Comparing relationship properties within a path

You want to compare relationship-properties of a path, either by a global value or parameter, or with each other within a path. Basic model: (:Person {person_id:’123′})-[:FRIEND {date:’1972-01-01′}]->(m:Person {person_id:’456′}) Make sure to have an constraint OR index on your nodes: Constraint… Read more →

Understanding non-existent properties and working with nulls

In Neo4j, since there is no table schema or equivalent to restrict possible properties, non-existence and null are equivalent for node and relationship properties. That is, there really is no such thing as a property with a null value; null… Read more →

How to avoid using excessive memory on deletes involving dense nodes

In situations where you know you need to delete a bunch of nodes (and by rule their relationships as well), it can be tempting to simply use DETACH DELETE and be done with it. However, this can become problematic if… Read more →

Updating a node but returning its state from before the update

Some use cases require updating node (or relationship) properties, but returning the node (or relationship) as it was prior to the update. You’ll need to get a ‘snapshot’ of the node before the update, and return that snapshot instead of… Read more →

Neo4j: Convert string to date

Neo4j 3.4 saw the introduction of the temporal date type, and while there is now powerful in built functionality, converting strings to dates is still a challenge. If our string is in the format yyyy-MM-dd we can call the date… Read more →

Using Cypher to generate Cypher statements to recreate indexes and constraints

The following can be used to extract index defintions and constraint defintions from an existing database and the resultant output can be played back on another Neo4j database. For example with the :play movies dataset as when run from the… Read more →

How do I produce an inventory of statistics on nodes, relationships, properties

MATCH (n) WHERE rand() <= 0.1 WITH labels(n) as labels, size(keys(n)) as props, size((n)–()) as degree RETURN DISTINCT labels, count(*) AS NumofNodes, avg(props) AS AvgNumOfPropPerNode, min(props) AS MinNumPropPerNode, max(props) AS MaxNumPropPerNode, avg(degree) AS AvgNumOfRelationships, min(degree) AS MinNumOfRelationships, max(degree) AS MaxNumOfRelationships… Read more →

Tuning Cypher queries by understanding cardinality

Cardinality issues are the most frequent culprit in slow or incorrect Cypher queries. Because of this, understanding cardinality, and using this understanding to manage cardinality issues, is a critical component in Cypher query tuning, and query correctness in general. A… Read more →

Alternatives to UNION queries

While UNIONs can be useful for certain cases, they can often be avoided completely with small changes to the query. In this article we’ll present various example cases where a UNION isn’t necessary, and a simple Cypher query will do.… Read more →

Export a (sub)graph to Cypher script and import it again

Oftentimes you want to export a full (or partial) database to a file and import it again without copying the actual database files. If you want to do the latter, use neo4j-admin dump/load. Here are two ways on how to… Read more →

Achieving longestPath Using Cypher

While Cypher is optimized for finding the shortest path between two nodes, with such functionality as shortestPath(), it does not have the same sort of function for longest path. In some cases, you may want this, and not the shortest… Read more →

Importing Data to Neo4j in the Cloud

Loading data in a Neo4j instance that is in the cloud is very similar to running Neo4j using any other method. However, there are a few small things to look out for and keep in mind when importing data to… Read more →

Resetting query cardinality

As queries execute, they build up result rows. Cypher executes operations per-row. When a query is made up of completely separate parts, unrelated to each other, and you don’t want to split the single query into multiple queries, you sometimes… Read more →

Understanding the Query Plan Cache

When a Cypher statement is first submitted Neo4j will attempt to determine if the query is in the plan cache before planning it. By default Neo4j will keep 1000 query plans in cache based upon the conf/neo4j.conf parameter of dbms.query_cache_size.… Read more →

Post-UNION processing

Cypher does not allow further processing of UNION or UNION ALL results, since RETURN is required in all queries of the union. Here are some workarounds. Post-UNION processing in Neo4j 4.0 With Neo4j 4.0, post-UNION processing is now possible via… Read more →

How to write a Cypher query to return the top N results per category

The following Cypher describes how you can display the Top 5 test scores within an entire :Score population broken out by a field_of_study property. create (n:Score {student_id: ‘DC001’, score: 89, field_of_study: ‘chemistry’}); create (n:Score {student_id: ‘MK812’, score: 97, field_of_study: ‘chemistry’});… Read more →

Cross Product Cypher queries will not perform well

Just like SQL, if you do not properly connect the parts of your query, it will result in a cross (cartesian) product, which is seldom what you want. Take the following example: MATCH (p:Person), (m:Movie) RETURN p, m; In Cypher,… Read more →

All shortest paths between a set of nodes

Consider a number of arbitrary nodes, A,B,C,D,E,F,…​.. I wish to return all of the shortest paths between these nodes. The nodes may have many edges between them, but anticipate a maximum of 4. The graph is complex and non hierarchical… Read more →