Label Propagation

Introduction

The Label Propagation algorithm (LPA) is a fast algorithm for finding communities in a graph. It detects these communities using network structure alone as its guide, and doesn’t require a pre-defined objective function or prior information about the communities.

LPA works by propagating labels throughout the network and forming communities based on this process of label propagation.

The intuition behind the algorithm is that a single label can quickly become dominant in a densely connected group of nodes, but will have trouble crossing a sparsely connected region. Labels get trapped inside a densely connected group of nodes, and the nodes that end up with the same label when the algorithm finishes can be considered part of the same community.

The algorithm works as follows:

Every node is initialized with a unique community label (an identifier).
These labels propagate through the network.
At every iteration of propagation, each node updates its label to the one held by the largest number of its neighbours. Ties are broken arbitrarily but deterministically.
LPA reaches convergence when each node has the majority label of its neighbours.
LPA stops when it either converges or reaches the user-defined maximum number of iterations.
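
One propagation round from the steps above can be sketched in plain SQL. This is only an illustration of the update rule, not how the application implements it. It assumes node and link tables shaped like the USERS and LINKS tables created in the Examples section, uses a hypothetical working table LABELS, treats every link as undirected, counts each link row as one vote, and breaks ties by taking the smallest label.

```sql
-- Hypothetical working table: every node starts with a unique label.
CREATE OR REPLACE TEMPORARY TABLE LABELS AS
SELECT NODEID, ROW_NUMBER() OVER (ORDER BY NODEID) AS LABEL
FROM EXAMPLE_DB.DATA_SCHEMA.USERS;

-- One synchronous round: each node adopts the label held by most of its neighbours.
WITH NEIGHBOURS AS (
    -- Assumption for this sketch: links are treated as undirected.
    SELECT SOURCENODEID AS NODEID, TARGETNODEID AS NEIGHBOURID
    FROM EXAMPLE_DB.DATA_SCHEMA.LINKS
    UNION ALL
    SELECT TARGETNODEID AS NODEID, SOURCENODEID AS NEIGHBOURID
    FROM EXAMPLE_DB.DATA_SCHEMA.LINKS
),
VOTES AS (
    -- Count how many neighbours of each node currently hold each label.
    SELECT n.NODEID, l.LABEL, COUNT(*) AS VOTE_COUNT
    FROM NEIGHBOURS n
    JOIN LABELS l ON l.NODEID = n.NEIGHBOURID
    GROUP BY n.NODEID, l.LABEL
)
SELECT NODEID, LABEL AS NEW_LABEL
FROM VOTES
-- Keep the most frequent label per node; ties go to the smallest label.
QUALIFY ROW_NUMBER() OVER (PARTITION BY NODEID ORDER BY VOTE_COUNT DESC, LABEL ASC) = 1;
```

Nodes without neighbours produce no row and would keep their current label. Repeating the round until no label changes, or until a maximum number of rounds is reached, corresponds to the stopping rule described above.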

As labels propagate, densely connected groups of nodes quickly reach a consensus on a unique label. At the end of the propagation only a few labels will remain - most will have disappeared. Nodes that have the same community label at convergence are said to belong to the same community.

One interesting feature of LPA is that nodes can be assigned preliminary labels to narrow down the range of solutions generated. This means that it can be used as a semi-supervised way of finding communities where we hand-pick some initial communities.

Syntax

This section covers the syntax used to execute the Label Propagation algorithm.

Run Label Propagation.

CALL Neo4j_Graph_Analytics.graph.label_propagation(
  'CPU_X64_XS',                    (1)
  {
    ['defaultTablePrefix': '...',] (2)
    'project': {...},              (3)
    'compute': {...},              (4)
    'write':   {...}               (5)
  }
);

1	Compute pool selector.
2	Optional prefix for table references.
3	Project config.
4	Compute config.
5	Write config.

Table 1. Parameters
Name	Type	Default	Optional	Description
computePoolSelector	String	`n/a`	no	The selector for the compute pool on which to run the Label Propagation job.
configuration	Map	`{}`	no	Configuration for graph project, algorithm compute and result write back.

The configuration map consists of the following three entries.

For more details on the Project configuration below, refer to the Project documentation.

Table 2. Project configuration
Name	Description
nodeTables	List of node tables.
relationshipTables	Map of relationship types to relationship tables.

Table 3. Compute configuration
Name	Type	Default	Optional	Description
resultProperty	String	`'community'`	yes	The node property that will be written back to the Snowflake database.
nodeWeightProperty	String	`null`	yes	The name of a node property that contains node weights.
relationshipWeightProperty	String	`null`	yes	Name of the relationship property to use as weights. If unspecified, the algorithm runs unweighted.
seedProperty	String	`n/a`	yes	Used to set the initial community for a node. The property value needs to be a non-negative number.
maxIterations	Integer	`10`	yes	The maximum number of iterations to run.

For more details on the Write configuration below, refer to the Write documentation.

Table 4. Write configuration
Name	Type	Default	Optional	Description
nodeProperty	String	`'community'`	yes	The node property that will be written back to the Snowflake database.

Examples

In this section we will show examples of running the Label Propagation algorithm on a concrete graph. The intention is to illustrate what the results look like and to provide a guide on how to make use of the algorithm in a real setting. We will do this on a small social network graph with a handful of nodes connected in a particular pattern.

The following SQL statements will create the example graph tables in the Snowflake database:

CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.USERS (NODEID VARCHAR, SEED NUMBER);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.USERS VALUES
  ('Alice', 42),
  ('Bridget', 42),
  ('Charles', 42),
  ('Doug', NULL),
  ('Mark', NULL),
  ('Michael', NULL);

CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.LINKS (SOURCENODEID VARCHAR, TARGETNODEID VARCHAR, WEIGHT FLOAT);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.LINKS VALUES
  ('Alice',   'Bridget', 1),
  ('Alice',   'Charles', 1),
  ('Charles', 'Bridget', 1),
  ('Alice',   'Doug',    5),
  ('Mark',    'Doug',    1),
  ('Mark',    'Michael', 1),
  ('Michael', 'Mark',    1);

This graph represents six users, some of whom follow each other. Besides a name property, each user also has a seed property. The seed property represents a value in the graph used to seed the node with a label. For example, this can be a result from a previous run of the Label Propagation algorithm. In addition, each relationship has a weight property.

With the node and relationship tables in Snowflake, we can now project them as a graph as part of an algorithm job. The following examples demonstrate the Label Propagation algorithm on this graph.

Run job

Running a Label Propagation algorithm job involves three steps: Project, Compute and Write.

To run the query, there is a required setup of grants for the application, your consumer role and your environment. Please see the Getting started page for more on this.

We also assume that the application name is the default Neo4j_Graph_Analytics. If you chose a different app name during installation, please replace it with that.

The following will run a Label Propagation algorithm job:

CALL Neo4j_Graph_Analytics.graph.label_propagation('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'USERS' ],
        'relationshipTables': {
            'LINKS': {
                'sourceTable': 'USERS',
                'targetTable': 'USERS'
            }
        }
    },
    'compute': {
    },
    'write': [{
        'nodeLabel': 'USERS',
        'outputTable': 'USERS_COMMUNITY'
    }]
});

Table 5. Results
JOB_ID	JOB_STATUS	JOB_START	JOB_END	JOB_RESULT
job_b1e167d1f8e64e229e4af0188072c61e	SUCCESS	2026-05-11 10:49:49.953	2026-05-11 10:49:58.546	{ "label_propagation_1": { "communityCount": 2, "communityDistribution": { "max": 3, "mean": 3, "min": 3, "p1": 3, "p10": 3, "p25": 3, "p5": 3, "p50": 3, "p75": 3, "p90": 3, "p95": 3, "p99": 3, "p999": 3 }, "computeMillis": 35, "configuration": { "concurrency": 2, "consecutiveIds": false, "maxIterations": 10, "nodeLabels": [ "*" ], "nodeWeightProperty": null, "relationshipTypes": [ "*" ], "resultProperty": "community", "seedProperty": null }, "didConverge": true, "ranIterations": 3 }, "project_graph_1": { "graphName": "snowgraph", "nodeCount": 6, "nodeLabels": { "NODE": { "count": 6, "nodeId": { "dataType": "LONG" }, "properties": { "POSTS": { "dataType": "LONG" }, "SEED_LABEL": { "dataType": "LONG" } }, "table": "EXAMPLE_DB.DATA_SCHEMA.USERS" } }, "nodeMillis": 340, "relationshipCount": 10, "relationshipMillis": 931, "relationshipTypes": { "RELATIONSHIP": { "count": 10, "direction": "DIRECTED", "properties": { "WEIGHT": { "dataType": "DOUBLE" } }, "sourceTable": "EXAMPLE_DB.DATA_SCHEMA.USERS", "targetTable": "EXAMPLE_DB.DATA_SCHEMA.USERS" } }, "totalMillis": 1271 }, "write_node_property_1": { "copyIntoTableMillis": 969, "nodeLabel": "USERS", "nodeProperty": "community", "outputTable": "EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY", "rowsWritten": 6, "stageUploadMillis": 2157, "writeMillis": 3320 } }

In the above example we can see that our graph has two communities, each containing three nodes. The default behaviour of the algorithm is to run unweighted, i.e. without using node or relationship weights. The weighted option is demonstrated in the Weighted section. The communities written by the job can be inspected by querying the output table:

SELECT * FROM EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY;

Table 6. Results
NODEID	COMMUNITY
Alice	1
Bridget	1
Charles	4
Doug	4
Mark	4
Michael	1

This run used the default values for all compute configuration parameters.
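
To see the communities as groups rather than as one row per node, the output table can be aggregated by community. The query below is a small sketch that assumes the output table and the NODEID and COMMUNITY columns shown in Table 6; the column aliases MEMBER_COUNT and MEMBERS are illustrative names.

```sql
-- One row per community, with its size and a comma-separated member list.
SELECT
    COMMUNITY,
    COUNT(*) AS MEMBER_COUNT,
    LISTAGG(NODEID, ', ') WITHIN GROUP (ORDER BY NODEID) AS MEMBERS
FROM EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY
GROUP BY COMMUNITY
ORDER BY MEMBER_COUNT DESC, COMMUNITY;
```

With the rows shown in Table 6, this would return two rows of three members each, matching the communityCount and communityDistribution values reported in the job result.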

Weighted

The Label Propagation algorithm can also run on weighted graphs. To tell the algorithm to use the projected relationship weights, set the relationshipWeightProperty configuration parameter.

The following will run the algorithm on a weighted graph:

CALL Neo4j_Graph_Analytics.graph.label_propagation('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'USERS' ],
        'relationshipTables': {
            'LINKS': {
                'sourceTable': 'USERS',
                'targetTable': 'USERS'
            }
        }
    },
    'compute': {
        'relationshipWeightProperty': 'WEIGHT'
    },
    'write': [{
        'nodeLabel': 'USERS',
        'outputTable': 'USERS_COMMUNITY_WEIGHTED'
    }]
});

Table 7. Results
JOB_ID	JOB_STATUS	JOB_START	JOB_END	JOB_RESULT
job_b198320e023c42d8823da59e1f8ce3d0	SUCCESS	2026-05-11 10:23:29.793	2026-05-11 10:23:39.255	{ "label_propagation_1": { "communityCount": 2, "communityDistribution": { "max": 4, "mean": 3, "min": 2, "p1": 2, "p10": 2, "p25": 2, "p5": 2, "p50": 2, "p75": 4, "p90": 4, "p95": 4, "p99": 4, "p999": 4 }, "computeMillis": 85, "configuration": { "concurrency": 2, "consecutiveIds": false, "maxIterations": 10, "nodeLabels": [ "*" ], "nodeWeightProperty": null, "relationshipTypes": [ "*" ], "relationshipWeightProperty": "WEIGHT", "resultProperty": "community", "seedProperty": null }, "didConverge": true, "ranIterations": 4 }, "project_graph_1": { "graphName": "snowgraph", "nodeCount": 6, "nodeLabels": { "NODE": { "count": 6, "nodeId": { "dataType": "LONG" }, "properties": { "POSTS": { "dataType": "LONG" }, "SEED_LABEL": { "dataType": "LONG" } }, "table": "EXAMPLE_DB.DATA_SCHEMA.USERS" } }, "nodeMillis": 403, "relationshipCount": 10, "relationshipMillis": 810, "relationshipTypes": { "RELATIONSHIP": { "count": 10, "direction": "DIRECTED", "properties": { "WEIGHT": { "dataType": "DOUBLE" } }, "sourceTable": "EXAMPLE_DB.DATA_SCHEMA.USERS", "targetTable": "EXAMPLE_DB.DATA_SCHEMA.USERS" } }, "totalMillis": 1213 }, "write_node_property_1": { "copyIntoTableMillis": 1083, "nodeLabel": "USERS", "nodeProperty": "community", "outputTable": "EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY_WEIGHTED", "rowsWritten": 6, "stageUploadMillis": 2026, "writeMillis": 3373 } }

SELECT * FROM EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY_WEIGHTED;

Table 8. Results
NODEID	COMMUNITY
Alice	4
Bridget	2
Charles	4
Doug	4
Mark	4
Michael	2

Compared to the unweighted run of the algorithm we still have two communities, but they now contain two and four nodes respectively. Using the weighted relationships, the nodes Alice and Doug are now in the same community, as the link between them has a weight of 5 while every other link has a weight of 1.
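
The seedProperty parameter from the compute configuration table can be tried on the same example tables. The call below is a sketch under assumptions rather than a verified run: it assumes the SEED column of USERS is projected as a node property that can be referenced by that name, and the output table name USERS_COMMUNITY_SEEDED is a hypothetical choice. How rows with a NULL seed are treated is not covered on this page, so no expected result is shown.

```sql
CALL Neo4j_Graph_Analytics.graph.label_propagation('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'USERS' ],
        'relationshipTables': {
            'LINKS': {
                'sourceTable': 'USERS',
                'targetTable': 'USERS'
            }
        }
    },
    'compute': {
        -- Assumption: SEED is the column created in the USERS table above.
        'seedProperty': 'SEED'
    },
    'write': [{
        'nodeLabel': 'USERS',
        -- Hypothetical output table name for this sketch.
        'outputTable': 'USERS_COMMUNITY_SEEDED'
    }]
});
```

In the example data, Alice, Bridget and Charles share the seed value 42, so they start the run with a common initial label instead of three unique ones.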