6.5.9. Depth First Search

This section describes the Depth First Search traversal algorithm in the Neo4j Graph Data Science library.

This algorithm is in the alpha tier. For more information on algorithm tiers, see Chapter 6, Algorithms.

This topic includes:

6.5.9.1. Introduction

The Depth First Search algorithm is a graph traversal that starts at a given node and explores as far as possible along each branch before backtracking, see https://en.wikipedia.org/wiki/Depth-first_search. A related algorithm is the Breath First Search algorithm, Breath First Search. This algorithm can be preferred over Breath First Search for example if one wants to find a target node at a large distance and exploring a random path has decent probability of success. There are multiple termination conditions supported for the traversal, based on either reaching one of several target nodes, reaching a maximum depth, exhausting a given budget of traversed relationship cost, or just traversing the whole graph. The output of the procedure contains information about which nodes were visited and in what order.

6.5.9.2. Syntax

The following describes the API for running the algorithm and stream results: 

CALL gds.alpha.dfs.stream(
  graphName: String,
  configuration: Map
)
YIELD
  // general stream return columns
  startNodeId: Integer,
  nodeIds: Integer,
  path: Path

Table 6.419. Parameters
Name Type Default Optional Description

graphNameOrConfig

String or Map

null

no

Either the name of a graph stored in the catalog or a Map configuring the graph creation and algorithm execution.

configuration

Map

{}

yes

Configuration for algorithm-specifics and/or graph filtering. Must be empty if graphNameOrConfig is a Map.

Table 6.420. General configuration
Name Type Default Optional Description

concurrency

Integer

4

yes

The number of concurrent threads used for running the algorithm. Also provides the default value for 'readConcurrency' and 'writeConcurrency'.

readConcurrency

Integer

value of 'concurrency'

yes

The number of concurrent threads used for reading the graph.

writeConcurrency

Integer

value of 'concurrency'

yes

The number of concurrent threads used for writing the result (applicable in WRITE mode).

nodeProjection

Map or List

null

yes

The node projection used for implicit graph loading or filtering nodes of an explicitly loaded graph.

relationshipProjection

Map or List

null

yes

The relationship projection used for implicit graph loading or filtering relationship of an explicitly loaded graph.

nodeQuery

String

null

yes

The Cypher query used to select the nodes for implicit graph loading via a Cypher projection.

relationshipQuery

String

null

yes

The Cypher query used to select the relationships for implicit graph loading via a Cypher projection.

nodeProperties

Map or List

null

yes

The node properties to load during implicit graph loading.

relationshipProperties

Map or List

null

yes

The relationship properties to load during implicit graph loading.

Table 6.421. Algorithm specific configuration
Name Type Default Optional Description

startNodeId

Integer

n/a

no

The node id of the node where to start the traversal.

targetNodes

Integer[]

empty list

yes

Ids for target nodes. Traversal terminates when any target node is visited.

maxDepth

Integer

-1

yes

The maximum distance from the start node at which nodes are visited.

Table 6.422. Results
Name Type Description

startNodeId

Integer

The node id of the node where to start the traversal.

nodeIds

Integer[]

The ids of all nodes that were visited during the traversal.

path

Path

A path containing all the nodes that were visited during the traversal.

6.5.9.3. Examples

Consider the graph created by the following Cypher statement:

CREATE
       (nA:Node {tag: 'a'}),
       (nB:Node {tag: 'b'}),
       (nC:Node {tag: 'c'}),
       (nD:Node {tag: 'd'}),
       (nE:Node {tag: 'e'}),

       (nA)-[:REL {cost: 8.0}]->(nB),
       (nA)-[:REL {cost: 9.0}]->(nC),
       (nB)-[:REL {cost: 1.0}]->(nE),
       (nC)-[:REL {cost: 5.0}]->(nD)

The following statement will create the graph and store it in the graph catalog. 

CALL gds.graph.create('myGraph', 'Node', 'REL', { relationshipProperties: 'cost' })

In the following examples we will demonstrate using the Depth First Search algorithm on this graph. If we do not specify any of the options for early termination, the whole graph is visited:

Running the Depth First Search algorithm: 

MATCH (a:Node{tag:'a'})
WITH id(a) AS startNode
CALL gds.alpha.dfs.stream('myGraph', {startNode: startNode})
YIELD path
UNWIND [ n in nodes(path) | n.tag ] AS tags
RETURN tags
ORDER BY tags

Table 6.423. Results
tags

"a"

"b"

"c"

"d"

"e"

If specifying d and e as target nodes, not all nodes at distance 1 will be visited due to the depth first traversal order, in which node d is reached before b:

Running the Depth First Search algorithm with target nodes: 

MATCH (a:Node{tag:'a'}), (d:Node{tag:'d'}), (e:Node{tag:'e'})
WITH id(a) AS startNode, [id(d), id(e)] AS targetNodes
CALL gds.alpha.dfs.stream('myGraph', {startNode: startNode, targetNodes: targetNodes})
YIELD path
UNWIND [ n in nodes(path) | n.tag ] AS tags
RETURN tags
ORDER BY tags

Table 6.424. Results
tags

"a"

"c"

"d"

Running the Depth First Search algorithm with maxDepth: 

MATCH (a:Node{tag:'a'})
WITH id(a) AS startNode
CALL gds.alpha.dfs.stream('myGraph', {startNode: startNode, maxDepth: 1})
YIELD path
UNWIND [ n in nodes(path) | n.tag ] AS tags
RETURN tags
ORDER BY tags

Table 6.425. Results
tags

"a"

"b"

"c"

In the above case, nodes d and e were not visited since they are at distance 2 from a.