Expand to subgraph

The expand to subgraph procedure expands to subgraph NODE values reachable from the start NODE following RELATIONSHIP values to max-level adhering to the label filters. Returns the collection of NODE values in the subgraph, and the collection of RELATIONSHIP values between all subgraph NODE values. It allows fine-grained control over the traversals that expand these subgraphs.

Procedure Overview

The procedure is described below:

Qualified Name Type

apoc.path.subgraphAll
apoc.path.subgraphAll(startNode ANY, config MAP<STRING, ANY>) - returns the sub-graph reachable from the start NODE following the given RELATIONSHIP types to max-depth.

Procedure

Configuration parameters

The procedures support the following config parameters:

Table 1. Config parameters
name type default description

minLevel

INTEGER

-1

the minimum number of hops in the traversal. Must be 0 or 1 if specified

maxLevel

INTEGER

-1

the maximum number of hops in the traversal

relationshipFilter

STRING

null

the relationship types and directions to traverse.

See Relationship Filters.

labelFilter

STRING

null

the node labels to traverse.

See Label Filters.

beginSequenceAtStart

BOOLEAN

true

starts matching sequences of node labels and/or relationship types (defined in relationshipFilter, labelFilter, or sequences) one node away from the start node.

bfs

BOOLEAN

true

use Breadth First Search when traversing. Uses Depth First Search if set to false

filterStartNode

BOOLEAN

false

whether the labelFilter and sequence apply to the start node of the expansion.

limit

INTEGER

-1

limit the number of paths returned. When using bfs:true, this has the effect of returning paths to the n nearest nodes with labels in the termination or end node filter, where n is the limit given. If set to true, a null value is yielded whenever the expansion would normally eliminate rows due to no results.

endNodes

LIST<NODE>

null

only these nodes can end returned paths, and expansion will continue past these nodes, if possible.

terminatorNodes

LIST<NODE>

null

Only these nodes can end returned paths, and expansion won’t continue past these nodes.

allowlistNodes

LIST<NODE>

null

Only these nodes are allowed in the expansion (though endNodes and terminatorNodes will also be allowed, if present).

denylistNodes

LIST<NODE>

null

None of the paths returned will include these nodes.

whitelistNodes (deprecated)

LIST<NODE>

null

See allowlistNodes.

blacklistNodes (deprecated)

LIST<NODE>

null

See denylistNodes.

It also has the following fixed parameter:

Table 2. Config parameters
name type default description

uniqueness

STRING

NODE_GLOBAL

the strategy to use when expanding relationships in a traversal. NODE_GLOBAL means that a node cannot be traversed more than once. This is what the legacy traversal framework does.

Relationship Filters

The syntax for relationship filters is described below:

Syntax: [<]RELATIONSHIP_TYPE1[>]|[<]RELATIONSHIP_TYPE2[>]|…​

input type direction

LIKES>

LIKES

OUTGOING

<FOLLOWS

FOLLOWS

INCOMING

KNOWS

KNOWS

BOTH

>

any type

OUTGOING

<

any type

INCOMING

Label Filters

The syntax for label filters is described below:

Syntax: [+-/>]LABEL1|LABEL2|*|…​

Symbol Filter Type Input Example Description

-

Denylist

-Foe

No node in the path will have a label that is present in the denylist.

+

Allowlist

+Friend

All nodes in the path must have a label in the allowlist (exempting termination and end nodes, if using those filters). If no allowlist operator is present, all labels are allowed.

/

Termination

/Friend

Only return paths up to a node with the given labels, and stop further expansion beyond it. Termination nodes do not have to respect the allowlist. Termination filtering takes precedence over end node filtering.

>

End node

>Friend

Only return paths up to a node with the given labels, but continue expansion to match end nodes beyond it. End nodes do not have to respect the allowlist to be returned, but expansion beyond them is only allowed if the node has a label in the allowlist.

:

Compound label

Foe:Friend

This returns a conjunction of labels, e.g /Foo:Bar means the termination node has to match both Foo and Bar. To include : in the label that do not have a special meaning, escape them using a \ e.g Foo\:Bar is the label Foo:Bar.

Label filter operator precedence and behavior

Multiple label filter operators are allowed at the same time. Take the following example:

labelFilter:'+Person|Movie|-SciFi|>Western|/Romance'

If we work through this label filter, we can see that:

  • :Person and :Movie labels are allowlisted

  • :SciFi is denylisted

  • :Western is an end node label

  • :Romance is as a termination label.

The precedence of operator evaluation isn’t dependent upon their location in the labelFilter but is fixed:

Denylist filter -, termination filter /, end node filter >, allowlist filter +.

This means:

  • No denylisted label - will ever be present in the nodes of paths returned, even if the same label (or another label of a node with a denylisted label) is included in another filter list.

  • If the termination filter / or end node filter > is used, then only paths up to nodes with those labels will be returned as results. These end nodes are exempt from the allowlist filter.

  • If a node is a termination node /, no further expansion beyond the node will occur.

  • The allowlist only applies to nodes up to but not including end nodes from the termination or end node filters. If no end node or termination node operators are present, then the allowlist applies to all nodes of the path.

  • If no allowlist operators are present in the labelFilter, this is treated as if all labels are allowlisted.

Examples

The examples in this section are based on the following sample graph:

MERGE (mark:Person:DevRel {name: "Mark"})
MERGE (lju:Person:DevRel {name: "Lju"})
MERGE (praveena:Person:Engineering {name: "Praveena"})
MERGE (zhen:Person:Engineering {name: "Zhen"})
MERGE (martin:Person:Engineering {name: "Martin"})
MERGE (joe:Person:Field {name: "Joe"})
MERGE (stefan:Person:Field {name: "Stefan"})
MERGE (alicia:Person:Product {name: "Alicia"})
MERGE (jake:Person:Product {name: "Jake"})
MERGE (john:Person:Product {name: "John"})
MERGE (jonny:Person:Sales {name: "Jonny"})
MERGE (anthony:Person:Sales {name: "Anthony"})
MERGE (rik:Person:Sales {name: "Rik"})

MERGE (zhen)-[:KNOWS]-(stefan)
MERGE (zhen)-[:KNOWS]-(lju)
MERGE (zhen)-[:KNOWS]-(praveena)
MERGE (zhen)-[:KNOWS]-(martin)
MERGE (mark)-[:KNOWS]-(jake)
MERGE (alicia)-[:KNOWS]-(jake)
MERGE (jonny)-[:KNOWS]-(anthony)
MERGE (john)-[:KNOWS]-(rik)

MERGE (alicia)-[:FOLLOWS]->(joe)
MERGE (joe)-[:FOLLOWS]->(mark)
MERGE (joe)-[:FOLLOWS]->(praveena)
MERGE (joe)-[:FOLLOWS]->(zhen)
MERGE (mark)-[:FOLLOWS]->(stefan)
MERGE (stefan)-[:FOLLOWS]->(joe)
MERGE (praveena)-[:FOLLOWS]->(joe)
MERGE (lju)-[:FOLLOWS]->(jake)
MERGE (alicia)-[:FOLLOWS]->(jonny)
MERGE (zhen)-[:FOLLOWS]->(john)
MERGE (anthony)-[:FOLLOWS]->(joe)

The Neo4j Browser visualization below shows the sample graph:

apoc.path.expandConfig
Figure 1. Sample Graph

The KNOWS relationship type is considered to be bidirectional, where if Zhen knows Stefan, we can imply that Stefan knows Zhen. When using the KNOWS relationship we will ignore the direction.

The FOLLOWS relationship has a direction, so we will specify a direction when we use it.

Relationship Type and Node Label filters

Let’s start by expanding paths from the Praveena node. We only want to consider the KNOWS relationship type, so we’ll specify that as the relationshipFilter parameter.

The following returns the subgraph reachable by the KNOWS relationship at 1 to 2 hops from Praveena
MATCH (p:Person {name: "Praveena"})
CALL apoc.path.subgraphAll(p, {
	relationshipFilter: "KNOWS",
    minLevel: 1,
    maxLevel: 2
})
YIELD nodes, relationships
RETURN nodes, relationships;

We can see a Neo4j Browser visualization of the returned subgraph in Subgraph from Praveena.

apoc.path.subgraphAll.praveena
Figure 2. Subgraph from Praveena

We can also provide a node label filter to restrict the nodes that are returned. If we want to only return paths where every node has the Engineering label, we’ll provide the value +Engineering to the labelFilter parameter.

The following returns the subgraph o Engineering people reachable by the KNOWS relationship at 1 to 2 hops from Praveena
MATCH (p:Person {name: "Praveena"})
CALL apoc.path.subgraphAll(p, {
	relationshipFilter: "KNOWS",
	labelFilter: "+Engineering",
    minLevel: 1,
    maxLevel: 2
})
YIELD nodes, relationships
RETURN nodes, relationships;

We can see a Neo4j Browser visualization of the returned subgraph in Subgraph of Engineering nodes from Praveena.

apoc.path.subgraphAll.praveena engineering
Figure 3. Subgraph of Engineering nodes from Praveena

We lose Lju and Stefan because those nodes don’t have the Engineering label.

We can specify multiple relationship types. The following query starts from the Alicia node, and then expands the FOLLOWS and KNOWS relationships:

The following returns the subgraph of people reachable by the FOLLOWS or KNOWS relationships at 1 to 3 hops from Alicia
MATCH (p:Person {name: "Alicia"})
CALL apoc.path.subgraphAll(p, {
    relationshipFilter: "FOLLOWS>|KNOWS",
    minLevel: 1,
    maxLevel: 3
})
YIELD nodes, relationships
RETURN nodes, relationships;

We can see a Neo4j Browser visualization of the returned subgraph in Subgraph from Alicia.

apoc.path.subgraphAll.alicia
Figure 4. Subgraph from Alicia

This subgraph includes all but one of the people in our graph, which means that Alicia is very well connected.

We can also specify traversal termination criteria using label filters. If we wanted to terminate a traversal as soon as the traversal encounters a node containing the Engineering label, we can use the /Engineering node filter.

The following returns the subgraph reachable by the FOLLOWS or KNOWS relationships at 1 to 3 hops from Alicia, terminating as soon as a node with the Engineering label is reached
MATCH (p:Person {name: "Alicia"})
CALL apoc.path.subgraphAll(p, {
    relationshipFilter: "FOLLOWS>|KNOWS",
    labelFilter: "/Engineering",
    minLevel: 1,
    maxLevel: 3
})
YIELD nodes, relationships
RETURN nodes, relationships;

We can see a Neo4j Browser visualization of the returned subgraph in Subgraph from Alicia terminating at Engineering nodes.

apoc.path.subgraphAll.alicia engineering
Figure 5. Subgraph from Alicia terminating at Engineering nodes

We’re now down to only 2 people - Zhen and Praveena. But this query doesn’t capture all of the paths from Alicia that end in a node with the Engineering label. We can use the >Engineering node filter to define a traversal that:

  • only returns paths that terminate at nodes with the Engineering label

  • continues expansion to end nodes after that, looking for more paths that end with the Engineering label

The following returns the subgraph of Engineering people reachable by the FOLLOWS or KNOWS relationships at 1 to 3 hops from Alicia
MATCH (p:Person {name: "Alicia"})
CALL apoc.path.subgraphAll(p, {
    relationshipFilter: "FOLLOWS>|KNOWS",
    labelFilter: ">Engineering",
    minLevel: 1,
    maxLevel: 3
})
YIELD nodes, relationships
RETURN nodes, relationships;

We can see a Neo4j Browser visualization of the returned subgraph in Subgraph from Alicia ending at Engineering nodes.

apoc.path.subgraphAll.alicia engineering end
Figure 6. Subgraph from Alicia ending at Engineering nodes

Our subgraph now also includes Martin, who is reached via a relationship from Zhen.

Terminator Nodes and End Nodes

As well as specifying terminator and end labels for traversals, we can also specify terminator and end nodes.

Let’s build on the previous query that found people that Alicia KNOWS or FOLLOWS. We want the returned subgraph to stop as soon as the Mark, Joe, Zhen, or Praveena nodes are reached. We can do that by passing those nodes to the terminatorNodes parameter.

The following returns the subgraph of people that Alicia FOLLOWS or KNOWS from 1 to 3 hops, terminating as soon as Mark, Joe, Zhen, or Rik nodes are reached
MATCH (p:Person {name: "Alicia"})
MATCH (terminator:Person)
WHERE terminator.name IN ["Mark", "Joe", "Zhen", "Rik"]
WITH p, collect(terminator) AS terminatorNodes
CALL apoc.path.subgraphAll(p, {
    relationshipFilter: "FOLLOWS>|KNOWS",
    minLevel: 1,
    maxLevel: 3,
    terminatorNodes: terminatorNodes
})
YIELD nodes, relationships
RETURN nodes, relationships;

We can see a Neo4j Browser visualization of the returned subgraph in Subgraph from Alicia terminating at Mark, Joe, Zhen, or Rik.

apoc.path.subgraphAll.alicia terminator
Figure 7. Subgraph from Alicia terminating at Mark, Joe, Zhen, or Rik

We have paths to Mark and Joe, but Zhen and Rik can’t be reached This could be because there is no path to Zhen and Rik that doesn’t go through Mark and Joe, or it could mean that there’s no path based on the other traversal criteria.

We can find out whether Mark, Joe, Zhen, or Rik are reachable by passing these nodes to the endNodes parameter.

The following returns the subgraph of people that Alicia FOLLOWS or KNOWS from 1 to 3 hops, ending as soon as Mark, Joe, Zhen, or Rik nodes are reached
MATCH (p:Person {name: "Alicia"})
MATCH (end:Person)
WHERE end.name IN ["Mark", "Joe", "Zhen", "Rik"]
WITH p, collect(end) AS endNodes
CALL apoc.path.subgraphAll(p, {
    relationshipFilter: "FOLLOWS>|KNOWS",
    minLevel: 1,
    maxLevel: 3,
    endNodes: endNodes
})
YIELD nodes, relationships
RETURN nodes, relationships;

We can see a Neo4j Browser visualization of the returned subgraph in Subgraph from Alicia ending at Mark, Joe, Zhen, or Rik.

apoc.path.subgraphAll.alicia end
Figure 8. Subgraph from Alicia ending at Mark, Joe, Zhen, or Rik

We can now reach Joe, Mark, and Zhen, but Rik is still unreachable.

Allowlist Nodes and Denylist Nodes

Allowlist and denylist nodes can also be specified.

Let’s build on the previous query that found people that Alicia KNOWS or FOLLOWS. We want any returned paths to only include the nodes Mark, Joe, Zhen, and Praveena, which we can do by passing these nodes to the parameter allowlistNodes.

The following returns nodes reachable by the FOLLOWS or KNOWS relationship types at 1 to 3 hops from Alicia, where the paths to those nodes must only include Mark, Jonny, or Zhen
MATCH (p:Person {name: "Alicia"})
MATCH (allowlist:Person)
WHERE allowlist.name IN ["Jonny", "Mark", "Zhen"]
WITH p, collect(allowlist) AS allowlistNodes
CALL apoc.path.subgraphAll(p, {
    relationshipFilter: "FOLLOWS>|KNOWS",
    minLevel: 1,
    maxLevel: 3,
    allowlistNodes: allowlistNodes
})
YIELD nodes, relationships
RETURN nodes, relationships;

We can see a Neo4j Browser visualization of the returned subgraph in Subgraph from Alicia where paths to nodes include Mark, Jonny, or Zhen.

apoc.path.subgraphAll.alicia allowlist
Figure 9. Subgraph from Alicia where paths to nodes include Mark, Jonny, or Zhen

Only Jonny can be reached. We can therefore infer that Mark and Zhen are only reachable via another node that wasn’t include in the allowlist.

A denylist is used to exclude nodes from the paths that lead to reachable nodes. If we want to return nodes that are reachable without going through Joe, we can do this by passing the Joe node to the denylistNodes parameter.

The following returns nodes reachable by the FOLLOWS or KNOWS relationship types at 1 to 3 hops from Alicia, where the paths to those nodes do not go through Joe
MATCH (p:Person {name: "Alicia"})
MATCH (joe:Person {name: "Joe"})
CALL apoc.path.subgraphAll(p, {
    relationshipFilter: "FOLLOWS>|KNOWS",
    minLevel: 1,
    maxLevel: 3,
    denylistNodes: [joe]
})
YIELD nodes, relationships
RETURN nodes, relationships;

We can see a Neo4j Browser visualization of the returned subgraph in Subgraph from Alicia where paths to nodes can’t go via Joe.

apoc.path.subgraphAll.alicia denylist joe
Figure 10. Subgraph from Alicia where paths to nodes can’t go via Joe

Sequences of relationship types

Sequences of relationship types can be specified by comma separating the values passed to relationshipFilter.

For example, if we want to start from the Joe node and traverse a sequence of the FOLLOWS relationship in the outgoing direction and the KNOWS relationship in either direction, we can specify the relationship filter FOLLOWS>,KNOWS.

The following returns the reachable nodes by following the FOLLOWS and KNOWS relationship types alternately from Joe
MATCH (p:Person {name: "Joe"})
CALL apoc.path.subgraphAll(p, {
	relationshipFilter: "FOLLOWS>,KNOWS",
	beginSequenceAtStart: true,
	minLevel: 1,
	maxLevel: 4
})
YIELD nodes, relationships
RETURN nodes, relationships;

We can see a Neo4j Browser visualization of the returned subgraph in Subgraph from Joe via alternate FOLLOWS and KNOWS relationship types.

apoc.path.subgraphAll.joe sequence
Figure 11. Subgraph from Joe via alternate FOLLOWS and KNOWS relationship types