5.5.8. Breadth First Search

This section describes the Breadth First Search traversal algorithm in the Neo4j Graph Data Science library.

This algorithm is in the alpha tier. For more information on algorithm tiers, see Chapter 5, Algorithms.

This topic includes:

5.5.8.1. Introduction

The Breadth First Search algorithm is a graph traversal algorithm that given a start node visits nodes in order of increasing distance, see https://en.wikipedia.org/wiki/Breadth-first_search. A related algorithm is the Depth First Search algorithm, Depth First Search. This algorithm is useful for searching when the likelihood of finding the node searched for decreases with distance. There are multiple termination conditions supported for the traversal, based on either reaching one of several target nodes, reaching a maximum depth, exhausting a given budget of traversed relationship cost, or just traversing the whole graph. The output of the procedure contains information about which nodes were visited and in what order.

5.5.8.2. Syntax

The following describes the API for running the algorithm and stream results: 

CALL gds.alpha.bfs.stream(
  graphName: string,
  configuration: map
)
YIELD
  // general stream return columns
  startNodeId: int,
  nodeIds: int,
  path: Path

Table 5.388. Parameters
Name Type Default Optional Description

graphNameOrConfig

String or Map

null

no

Either the name of a graph stored in the catalog or a Map configuring the graph creation and algorithm execution.

configuration

Map

{}

yes

Configuration for algorithm-specifics and/or graph filtering. Must be empty if graphNameOrConfig is a Map.

Table 5.389. General configuration
Name Type Default Optional Description

concurrency

Integer

4

yes

The number of concurrent threads used for running the algorithm. Also provides the default value for 'readConcurrency' and 'writeConcurrency'.

readConcurrency

Integer

value of 'concurrency'

yes

The number of concurrent threads used for reading the graph.

writeConcurrency

Integer

value of 'concurrency'

yes

The number of concurrent threads used for writing the result (applicable in WRITE mode).

nodeProjection

Map or List

null

yes

The node projection used for implicit graph loading or filtering nodes of an explicitly loaded graph.

relationshipProjection

Map or List

null

yes

The relationship projection used for implicit graph loading or filtering relationship of an explicitly loaded graph.

nodeQuery

String

null

yes

The Cypher query used to select the nodes for implicit graph loading via a Cypher projection.

relationshipQuery

String

null

yes

The Cypher query used to select the relationships for implicit graph loading via a Cypher projection.

nodeProperties

Map or List

null

yes

The node properties to load during implicit graph loading.

relationshipProperties

Map or List

null

yes

The relationship properties to load during implicit graph loading.

Table 5.390. Algorithm specific configuration
Name Type Default Optional Description

startNodeId

Integer

n/a

no

The node id of the node where to start the traversal.

targetNodes

Integer[]

empty list

yes

Ids for target nodes. Traversal terminates when any target node is visited.

maxDepth

Integer

-1

yes

The maximum distance from the start node at which nodes are visited.

maxCost

Integer

NaN

yes

The maximum accumulated cost of any path from start node to a node that should be visited.

Table 5.391. Results
Name Type Description

startNodeId

Integer

The node id of the node where to start the traversal.

nodeIds

Integer[]

The ids of all nodes that were visited during the traversal.

path

Path

A path containing all the nodes that were visited during the traversal.

5.5.8.3. Examples

Consider the graph created by the following Cypher statement:

CREATE
       (nA:Node {tag: 'a'}),
       (nB:Node {tag: 'b'}),
       (nC:Node {tag: 'c'}),
       (nD:Node {tag: 'd'}),
       (nE:Node {tag: 'e'}),

       (nA)-[:REL {cost: 8.0}]->(nB),
       (nA)-[:REL {cost: 9.0}]->(nC),
       (nB)-[:REL {cost: 1.0}]->(nE),
       (nC)-[:REL {cost: 5.0}]->(nD)

The following statement will create the graph and store it in the graph catalog. 

CALL gds.graph.create('myGraph', 'Node', 'REL', { relationshipProperties: 'cost' })

In the following examples we will demonstrate using the Breadth First Search algorithm on this graph.

Running the Breadth First Search algorithm: 

MATCH (a:Node{tag:'a'})
WITH id(a) AS startNode
CALL gds.alpha.bfs.stream('myGraph', {startNode: startNode})
YIELD path
UNWIND [ n in nodes(path) | n.tag ] AS tags
RETURN tags
ORDER BY tags

Table 5.392. Results
tags

"a"

"b"

"c"

"d"

"e"

Since none of the options for early termination are specified, the whole graph is visited during the traversal.

Running the Breadth First Search algorithm with target nodes: 

MATCH (a:Node{tag:'a'}), (d:Node{tag:'d'}), (e:Node{tag:'e'})
WITH id(a) AS startNode, [id(d), id(e)] AS targetNodes
CALL gds.alpha.bfs.stream('myGraph', {startNode: startNode, targetNodes: targetNodes})
YIELD path
UNWIND [ n in nodes(path) | n.tag ] AS tags
RETURN tags
ORDER BY tags

Table 5.393. Results
tags

"a"

"b"

"c"

"e"

Running the Breadth First Search algorithm with maxDepth: 

MATCH (a:Node{tag:'a'})
WITH id(a) AS startNode
CALL gds.alpha.bfs.stream('myGraph', {startNode: startNode, maxDepth: 1})
YIELD path
UNWIND [ n in nodes(path) | n.tag ] AS tags
RETURN tags
ORDER BY tags

Table 5.394. Results
tags

"a"

"b"

"c"

In the above example, nodes d and e were not visited since they are at distance 2 from a.

Running the Breadth First Search algorithm with maxCost: 

MATCH (a:Node{tag:'a'})
WITH id(a) AS startNode
CALL gds.alpha.bfs.stream('myGraph', {startNode: startNode, maxCost: 10, relationshipWeightProperty: 'cost'})
YIELD path
UNWIND [ n in nodes(path) | n.tag ] AS tags
RETURN tags
ORDER BY tags

Table 5.395. Results
tags

"a"

"b"

"c"

"e"

Due to the max cost limit of 10, node d cannot be reached since the total cost of the path from a to d is 14.