Breadth First Search
Introduction
The Breadth First Search (BFS) algorithm starts from a given node and visits nodes in order of increasing distance (number of hops) from the start, see https://en.wikipedia.org/wiki/Breadth-first_search. BFS is useful when the likelihood of finding relevant nodes decreases with distance.
There are multiple termination conditions supported for the traversal, based on either reaching one of several target nodes, reaching a maximum depth, or just traversing the whole graph.
The output of a BFS job is written to a Snowflake table.
The BFS path consists of individual links source → target
where source
and target
are consecutive nodes on the BFS path.
The output table contains precisely these individual links which together form the BFS path.
Since the BFS path may traverse nodes of various node labels, the labels of both nodes in the links are included in the output table.
Name | Type | Description |
---|---|---|
SOURCENODEID |
|
The id of the 'first' node of a link in the BFS path. Numeric ids are converted to string. |
TARGETNODEID |
|
The id of the 'second' node of a link in the BFS path. Numeric ids are converted to string. |
SOURCELABEL |
|
The name of the node label of the 'first' node of the link. |
TARGETLABEL |
|
The name of the node label of the 'second' node of the link. |
We denote such relationship tables as heterogeneous (over node labels).
The order of rows in the output table is not guaranteed to match the BFS visitation order. Also, the links of the BFS exploration are not necessarily relationships in the original graph, due to back-tracking 'jumps'. |
Syntax
This section covers the syntax used to execute the Breadth First Search algorithm.
CALL Neo4j_Graph_Analytics.graph.bfs(
'CPU_X64_XS', (1)
{
['defaultTablePrefix': '...',] (2)
'project': {...}, (3)
'compute': {...}, (4)
'write': {...} (5)
}
);
1 | Compute pool selector. |
2 | Optional prefix for table references. |
3 | Project config. |
4 | Compute config. |
5 | Write config. |
Name | Type | Default | Optional | Description |
---|---|---|---|---|
computePoolSelector |
String |
|
no |
The selector for the compute pool on which to run the BFS job. |
configuration |
Map |
|
no |
Configuration for graph project, algorithm compute and result write back. |
For more details on below Project configuration, refer to the Project documentation. |
Name | Type |
---|---|
nodeTables |
List of node tables. |
relationshipTables |
Map of relationship types to relationship tables. |
Name | Type | Default | Optional | Description |
---|---|---|---|---|
sourceNode |
String or Integer |
|
no |
The node id of the node where to start the traversal. |
sourceNodeTable |
String |
|
no |
Fully-qualified node table containing the |
targetNodes |
List<String or Integer> |
|
yes |
Set of target node identifiers; traversal stops when any is reached. |
targetNodesTable |
String |
|
depends on |
Fully-qualified node table containing the |
maxDepth |
Integer |
|
yes |
The maximum distance from the source node at which nodes are visited. Value of -1 means search continues to arbitrary distance. |
For more details on below Write configuration, refer to the Write documentation. |
Name | Type | Default | Optional | Description |
---|---|---|---|---|
outputTable |
String |
|
no |
Table in Snowflake database to which relationships are written. |
Examples
Now we will look at how to apply BFS to a small directed example graph. The graph is depicted like so:

The queries below create the example graph.
-- CREATE DATABASE EXAMPLE_DB; -- to do once
USE DATABASE EXAMPLE_DB;
-- CREATE SCHEMA DATA_SCHEMA; -- to do once
USE SCHEMA DATA_SCHEMA;
CREATE OR REPLACE TABLE nodes_A (nodeId varchar);
INSERT INTO nodes_A (nodeId)
SELECT 'a1' UNION ALL
SELECT 'a2';
CREATE OR REPLACE TABLE nodes_B (nodeId Number);
INSERT INTO nodes_B (nodeId)
SELECT 1 UNION ALL
SELECT 7;
CREATE OR REPLACE TABLE nodes_C (nodeId Number);
INSERT INTO nodes_C (nodeId)
SELECT 1 UNION ALL
SELECT 2;
CREATE OR REPLACE TABLE relationships_R (sourceNodeId varchar, targetNodeId Number);
INSERT INTO relationships_R VALUES ('a1', 7), ('a2', 1);
CREATE OR REPLACE TABLE relationships_S (sourceNodeId varchar, targetNodeId Number);
INSERT INTO relationships_S VALUES ('a1', 1);
CREATE OR REPLACE TABLE relationships_T (sourceNodeId Number, targetNodeId varchar);
INSERT INTO relationships_T VALUES (7, 'a2');
CREATE OR REPLACE TABLE relationships_U (sourceNodeId Number, targetNodeId Number);
INSERT INTO relationships_U VALUES (1, 2);
Run job (full traversal from a source)
Running a Breadth First Search job involves the three steps: Project, Compute and Write.
To run the query, there is a required setup of grants for the application, your consumer role and your environment. Please see the Getting started page for more on this.
We also assume that the application name is the default Neo4j_Graph_Analytics. If you chose a different app name during installation, please replace it with that.
CALL Neo4j_Graph_Analytics.graph.bfs('CPU_X64_XS', {
'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
'project': {
'nodeTables': ['nodes_A', 'nodes_B', 'nodes_C'],
'relationshipTables': {
'relationships_R': {
'sourceTable': 'nodes_A',
'targetTable': 'nodes_B'
},
'relationships_S': {
'sourceTable': 'nodes_A',
'targetTable': 'nodes_C'
},
'relationships_T': {
'sourceTable': 'nodes_B',
'targetTable': 'nodes_A'
},
'relationships_U': {
'sourceTable': 'nodes_B',
'targetTable': 'nodes_C'
}
}
},
'compute': {
'sourceNodeTable': 'nodes_A',
'sourceNode': 'a1',
'targetNodesTable': 'nodes_C',
'targetNodes': [2], -- will not be reached due to maxDepth
'maxDepth': 2
},
'write': [{'outputTable': 'BFS_NEXT'}]
})
JOB_ID | JOB_STATUS | JOB_START | JOB_END | JOB_RESULT |
---|---|---|---|---|
job_8a3f2e0b1c4248d3a0f3a9f0e3f1abcd |
SUCCESS |
2025-06-30 11:05:12.210 |
2025-06-30 11:05:16.845 |
{ "bfs_1": { "computeMillis": 12, "configuration": { "concurrency": 6, "jobId": "a41ac8f8-acf0-4d8d-a7d4-f6b80786a6fe", "logProgress": true, "maxDepth": 2, "mutateRelationshipType": "NEXT", "nodeLabels": [ "*" ], "relationshipTypes": [ "*" ], "sourceNode": "a1", "sourceNodeTable": "EXAMPLE_DB.DATA_SCHEMA.nodes_A", "sudo": false, "targetNodes": [ 2 ], "targetNodesTable": "EXAMPLE_DB.DATA_SCHEMA.nodes_C" }, "mutateMillis": 2, "postProcessingMillis": 0, "preProcessingMillis": 8, "relationshipsWritten": 3 }, "project_1": { "graphName": "snowgraph", "nodeCount": 6, "nodeMillis": 308, "relationshipCount": 5, "relationshipMillis": 572, "totalMillis": 880 }, "write_relationship_type_1": { "exportMillis": 2162, "outputTable": "EXAMPLE_DB.DATA_SCHEMA.BFS_NEXT", "relationshipProperty": "<none>", "relationshipType": "NEXT", "relationshipsExported": 3 } } |
We can now query the written results as follows.
SELECT * FROM EXAMPLE_DB.DATA_SCHEMA.BFS_NEXT;
SOURCENODEID | TARGETNODEID | SOURCELABEL | TARGETLABEL |
---|---|---|---|
a1 |
1 |
nodes_A |
nodes_C |
7 |
a2 |
nodes_B |
nodes_A |
1 |
7 |
nodes_C |
nodes_B |
These rows together represent the BFS visitation order ('a1': nodes_A) → (1: nodes_C) → (7: nodes_B) → ('a2': nodes_A)
.
The path contains all nodes at depth 2 or less, because the only target node (2: nodes_C)
has greater depth and would only be reached later.
The node (1: nodes_B)
has depth 3, which is less than the depth of target node, but more than the configured maximum depth of 2; thus it is not explored either.