Cypher path matching

Cypher® path matching uses relationship isomorphism, the same relationship cannot be returned more than once in the same result record.

Neo4j Cypher makes use of relationship isomorphism for path matching, which is a very effective way of reducing the result set size and preventing infinite traversals.

In Neo4j, all relationships have a direction. However, you can have the notion of undirected relationships at query time.

In the case of variable length pattern expressions, it is particularly important to have a constraint check, or an infinite number of result records could be found.

To understand this better, let us consider a few alternative options:

Homomorphism

No constraints for path matching.

Node isomorphism

The same node cannot be returned more than once for each path matching record.

Relationship isomorphism

The same relationship cannot be returned more than once for each path matching record. Cypher makes use of relationship isomorphism for path matching.

Homomorphism

Constraints: No constraints for path matching.

Example 1. Homomorphism

The graph is composed of only two nodes (a) and (b), connected by one relationship, (a:Node)-[r:R]->(b:Node).

If the query is looking for paths of length n and do not care about the direction, a path of length n will be returned repeating the two nodes over and over.

For example, find all paths with 5 relationships and do not care about the relationship direction:

MATCH p = ()-[*5]-()
RETURN nodes(p)

This will return the two resulting records if homomorphism was used, [a,b,a,b,a,b], as well as [b,a,b,a,b,a].

Node isomorphism

Constraints: The same node cannot be returned more than once for each path matching record.

In another two-node example, such as (a:Node)-[r:R]->(b:Node); only paths of length 1 can be found with the node isomorphism constraint.

Example 2. Node isomorphism

The graph is composed of only two nodes (a) and (b), connected by one relationship, (a:Node)-[r:R]->(b:Node).

MATCH p = ()-[*1]-()
RETURN nodes(p)

This will return the two resulting records if node isomorphism was used, [a, b], as well as [b, a].

Relationship isomorphism

Constraints: The same relationship cannot be returned more than once for each path matching record.

In another two-node example, such as (a:Node)-[r:R]->(b:Node); only paths of length 1 can be found with the relationship isomorphism constraint.

Example 3. Relationship isomorphism

The graph is composed of only two nodes (a) and (b), connected by one relationship, (a:Node)-[r:R]->(b:Node).

MATCH p = ()-[*1]-()
RETURN nodes(p)

This will return the two resulting records [a, b], as well as [b, a].

Cypher path matching example

Cypher makes use of relationship isomorphism for path matching.

Example 4. Friend of friends

Looking for a user’s friends of friends should not return said user.

To demonstrate this, let’s create a few nodes and relationships:

Query 1, create data.

CREATE
  (adam:User {name: 'Adam'}),
  (pernilla:User {name: 'Pernilla'}),
  (david:User {name: 'David'}),
  (adam)-[:FRIEND]->(pernilla),
  (pernilla)-[:FRIEND]->(david)
Nodes created: 3
Relationships created: 2
Properties set: 3

Which gives us the following graph:

graph2

Now let’s look for friends of friends of Adam:

Query 2, friend of friends of Adam.

MATCH (user:User {name: 'Adam'})-[r1:FRIEND]-()-[r2:FRIEND]-(friend_of_a_friend)
RETURN friend_of_a_friend.name AS fofName
Rows: 1

+---------+
| fofName |
+---------+
| "David" |
+---------+

In this query, Cypher makes sure to not return matches where the pattern relationships r1 and r2 point to the same graph relationship.

This is however not always desired. If the query should return the user, it is possible to spread the matching over multiple MATCH clauses, like so:

Query 3, multiple MATCH clauses.

MATCH (user:User {name: 'Adam'})-[r1:FRIEND]-(friend)
MATCH (friend)-[r2:FRIEND]-(friend_of_a_friend)
RETURN friend_of_a_friend.name AS fofName
Rows: 2

+---------+
| fofName |
+---------+
| "David" |
| "Adam"  |
+---------+

Note that while the following Query 4 looks similar to Query 3, it is actually equivalent to Query 2.

Query 4, equivalent to query 2.

MATCH
  (user:User {name: 'Adam'})-[r1:FRIEND]-(friend),
  (friend)-[r2:FRIEND]-(friend_of_a_friend)
RETURN friend_of_a_friend.name AS fofName

Here, the MATCH clause has a single pattern with two paths, while the previous query has two distinct patterns.

Rows: 1

+---------+
| fofName |
+---------+
| "David" |
+---------+