Label Propagation

Introduction

The Label Propagation algorithm (LPA) is a fast algorithm for finding communities in a graph. It detects these communities using network structure alone as its guide, and doesn’t require a pre-defined objective function or prior information about the communities.

LPA works by propagating labels throughout the network and forming communities based on this process of label propagation.

The intuition behind the algorithm is that a single label can quickly become dominant in a densely connected group of nodes, but will have trouble crossing a sparsely connected region. Labels get trapped inside a densely connected group of nodes, and the nodes that end up with the same label when the algorithm finishes can be considered part of the same community.

The algorithm works as follows:

  • Every node is initialized with a unique community label (an identifier).

  • These labels propagate through the network.

  • At every iteration of propagation, each node updates its label to the one held by the largest number of its neighbours. Ties are broken arbitrarily but deterministically.

  • LPA reaches convergence when each node has the majority label of its neighbours.

  • LPA stops when it either converges or reaches the user-defined maximum number of iterations.

As labels propagate, densely connected groups of nodes quickly reach a consensus on a unique label. At the end of the propagation only a few labels will remain - most will have disappeared. Nodes that have the same community label at convergence are said to belong to the same community.

One interesting feature of LPA is that nodes can be assigned preliminary labels to narrow down the range of solutions generated. This means that it can be used as a semi-supervised way of finding communities, where we hand-pick some initial communities.

For more information on this algorithm, see:

Syntax

This section covers the syntax used to execute the Label Propagation algorithm.

Run Label Propagation.
CALL Neo4j_Graph_Analytics.graph.label_propagation(
  'CPU_X64_XS',                    (1)
  {
    ['defaultTablePrefix': '...',] (2)
    'project': {...},              (3)
    'compute': {...},              (4)
    'write':   {...}               (5)
  }
);
(1) Compute pool selector.
(2) Optional prefix for table references.
(3) Project config.
(4) Compute config.
(5) Write config.
Table 1. Parameters
Name | Type | Default | Optional | Description
computePoolSelector | String | n/a | no | The selector for the compute pool on which to run the Label Propagation job.
configuration | Map | {} | no | Configuration for graph project, algorithm compute and result write back.

The configuration map consists of the following three entries.

For more details on the Project configuration below, refer to the Project documentation.
Table 2. Project configuration
Name | Description
nodeTables | List of node tables.
relationshipTables | Map of relationship types to relationship tables.

Table 3. Compute configuration
Name | Type | Default | Optional | Description
resultProperty | String | 'community' | yes | The node property that will be written back to the Snowflake database.
nodeWeightProperty | String | null | yes | The name of a node property that contains node weights.
relationshipWeightProperty | String | null | yes | Name of the relationship property to use as weights. If unspecified, the algorithm runs unweighted.
seedProperty | String | n/a | yes | Used to set the initial community for a node. The property value needs to be a non-negative number.
maxIterations | Integer | 10 | yes | The maximum number of iterations to run.

For more details on the Write configuration below, refer to the Write documentation.
Table 4. Write configuration
Name | Type | Default | Optional | Description
nodeProperty | String | 'community' | yes | The node property that will be written back to the Snowflake database.

Examples

In this section we will show examples of running the Label Propagation algorithm on a concrete graph. The intention is to illustrate what the results look like and to provide a guide on how to make use of the algorithm in a real setting. We will do this on a small social network graph of a handful of nodes connected in a particular pattern. The example graph looks like this:

Visualization of the example graph
The following SQL statement will create the example graph tables in the Snowflake database:
CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.USERS (NODEID VARCHAR, SEED NUMBER);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.USERS VALUES
  ('Alice', 42),
  ('Bridget', 42),
  ('Charles', 42),
  ('Doug', NULL),
  ('Mark', NULL),
  ('Michael', NULL);

CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.LINKS (SOURCENODEID VARCHAR, TARGETNODEID VARCHAR, WEIGHT FLOAT);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.LINKS VALUES
  ('Alice',   'Bridget', 1),
  ('Alice',   'Charles', 1),
  ('Charles', 'Bridget', 1),
  ('Alice',   'Doug',    5),
  ('Mark',    'Doug',    1),
  ('Mark',    'Michael', 1),
  ('Michael', 'Mark',    1);

This graph represents six users, some of whom follow each other. Besides a name property, each user also has a seed property. The seed property represents a value in the graph used to seed the node with a label. For example, this can be a result from a previous run of the Label Propagation algorithm. In addition, each relationship has a weight property.

With the node and relationship tables in Snowflake, we can now project them into a graph as part of an algorithm job. The following examples demonstrate the Label Propagation algorithm on this graph.

Run job

Running a Label Propagation algorithm job involves three steps: Project, Compute and Write.

To run the query, there is a required setup of grants for the application, your consumer role and your environment. Please see the Getting started page for more on this.

We also assume that the application name is the default Neo4j_Graph_Analytics. If you chose a different app name during installation, please replace it with that.

The following will run a Label Propagation algorithm job:
CALL Neo4j_Graph_Analytics.graph.label_propagation('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'USERS' ],
        'relationshipTables': {
            'LINKS': {
                'sourceTable': 'USERS',
                'targetTable': 'USERS'
            }
        }
    },
    'compute': {
    },
    'write': [{
        'nodeLabel': 'USERS',
        'outputTable': 'USERS_COMMUNITY'
    }]
});
Table 5. Results
JOB_ID | JOB_STATUS | JOB_START | JOB_END | JOB_RESULT
job_b1e167d1f8e64e229e4af0188072c61e | SUCCESS | 2026-05-11 10:49:49.953 | 2026-05-11 10:49:58.546 | (JSON shown below)

 {
  "label_propagation_1": {
    "communityCount": 2,
    "communityDistribution": {
      "max": 3,
      "mean": 3,
      "min": 3,
      "p1": 3,
      "p10": 3,
      "p25": 3,
      "p5": 3,
      "p50": 3,
      "p75": 3,
      "p90": 3,
      "p95": 3,
      "p99": 3,
      "p999": 3
    },
    "computeMillis": 35,
    "configuration": {
      "concurrency": 2,
      "consecutiveIds": false,
      "maxIterations": 10,
      "nodeLabels": [
        "*"
      ],
      "nodeWeightProperty": null,
      "relationshipTypes": [
        "*"
      ],
      "resultProperty": "community",
      "seedProperty": null
    },
    "didConverge": true,
    "ranIterations": 3
  },
  "project_graph_1": {
    "graphName": "snowgraph",
    "nodeCount": 6,
    "nodeLabels": {
      "NODE": {
        "count": 6,
        "nodeId": {
          "dataType": "LONG"
        },
        "properties": {
          "POSTS": {
            "dataType": "LONG"
          },
          "SEED_LABEL": {
            "dataType": "LONG"
          }
        },
        "table": "EXAMPLE_DB.DATA_SCHEMA.USERS"
      }
    },
    "nodeMillis": 340,
    "relationshipCount": 10,
    "relationshipMillis": 931,
    "relationshipTypes": {
      "RELATIONSHIP": {
        "count": 10,
        "direction": "DIRECTED",
        "properties": {
          "WEIGHT": {
            "dataType": "DOUBLE"
          }
        },
        "sourceTable": "EXAMPLE_DB.DATA_SCHEMA.USERS",
        "targetTable": "EXAMPLE_DB.DATA_SCHEMA.USERS"
      }
    },
    "totalMillis": 1271
  },
  "write_node_property_1": {
    "copyIntoTableMillis": 969,
    "nodeLabel": "USERS",
    "nodeProperty": "community",
    "outputTable": "EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY",
    "rowsWritten": 6,
    "stageUploadMillis": 2157,
    "writeMillis": 3320
  }
}

In the above example we can see that our graph has two communities, each containing three nodes. The default behaviour of the algorithm is to run unweighted, i.e. without using node or relationship weights. The weighted option is demonstrated in the Weighted section.

SELECT * FROM EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY;
Table 6. Results
NODEID | COMMUNITY
Alice | 1
Bridget | 1
Charles | 4
Doug | 4
Mark | 4
Michael | 1

We use default values for the procedure configuration parameters.
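
To relate the written table to the communityCount and communityDistribution values in the job result, the output can be aggregated per community. The query below is a minimal sketch, assuming the output column is named COMMUNITY as shown in Table 6; the MEMBER_COUNT and MEMBERS aliases are chosen here for illustration only.

```sql
-- Count the members of each community and list them by name.
SELECT
    COMMUNITY,
    COUNT(*) AS MEMBER_COUNT,
    ARRAY_AGG(NODEID) WITHIN GROUP (ORDER BY NODEID) AS MEMBERS
FROM EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY
GROUP BY COMMUNITY
ORDER BY MEMBER_COUNT DESC, COMMUNITY;
```

With the rows shown in Table 6, this would list two communities of three members each, which matches the reported communityCount of 2.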

Weighted

The Label Propagation algorithm can also run on weighted graphs. To tell the algorithm to use the projected relationship weights, set the relationshipWeightProperty configuration parameter.

The following will run the algorithm on a weighted graph:
CALL Neo4j_Graph_Analytics.graph.label_propagation('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'USERS' ],
        'relationshipTables': {
            'LINKS': {
                'sourceTable': 'USERS',
                'targetTable': 'USERS'
            }
        }
    },
    'compute': {
        'relationshipWeightProperty': 'WEIGHT'
    },
    'write': [{
        'nodeLabel': 'USERS',
        'outputTable': 'USERS_COMMUNITY_WEIGHTED'
    }]
});
Table 7. Results
JOB_ID | JOB_STATUS | JOB_START | JOB_END | JOB_RESULT
job_b198320e023c42d8823da59e1f8ce3d0 | SUCCESS | 2026-05-11 10:23:29.793 | 2026-05-11 10:23:39.255 | (JSON shown below)

 {
  "label_propagation_1": {
    "communityCount": 2,
    "communityDistribution": {
      "max": 4,
      "mean": 3,
      "min": 2,
      "p1": 2,
      "p10": 2,
      "p25": 2,
      "p5": 2,
      "p50": 2,
      "p75": 4,
      "p90": 4,
      "p95": 4,
      "p99": 4,
      "p999": 4
    },
    "computeMillis": 85,
    "configuration": {
      "concurrency": 2,
      "consecutiveIds": false,
      "maxIterations": 10,
      "nodeLabels": [
        "*"
      ],
      "nodeWeightProperty": null,
      "relationshipTypes": [
        "*"
      ],
      "relationshipWeightProperty": "WEIGHT",
      "resultProperty": "community",
      "seedProperty": null
    },
    "didConverge": true,
    "ranIterations": 4
  },
  "project_graph_1": {
    "graphName": "snowgraph",
    "nodeCount": 6,
    "nodeLabels": {
      "NODE": {
        "count": 6,
        "nodeId": {
          "dataType": "LONG"
        },
        "properties": {
          "POSTS": {
            "dataType": "LONG"
          },
          "SEED_LABEL": {
            "dataType": "LONG"
          }
        },
        "table": "EXAMPLE_DB.DATA_SCHEMA.USERS"
      }
    },
    "nodeMillis": 403,
    "relationshipCount": 10,
    "relationshipMillis": 810,
    "relationshipTypes": {
      "RELATIONSHIP": {
        "count": 10,
        "direction": "DIRECTED",
        "properties": {
          "WEIGHT": {
            "dataType": "DOUBLE"
          }
        },
        "sourceTable": "EXAMPLE_DB.DATA_SCHEMA.USERS",
        "targetTable": "EXAMPLE_DB.DATA_SCHEMA.USERS"
      }
    },
    "totalMillis": 1213
  },
  "write_node_property_1": {
    "copyIntoTableMillis": 1083,
    "nodeLabel": "USERS",
    "nodeProperty": "community",
    "outputTable": "EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY_WEIGHTED",
    "rowsWritten": 6,
    "stageUploadMillis": 2026,
    "writeMillis": 3373
  }
}
SELECT * FROM EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY_WEIGHTED;
Table 8. Results
NODEID | COMMUNITY
Alice | 4
Bridget | 2
Charles | 4
Doug | 4
Mark | 4
Michael | 2
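
To view the two runs side by side, the two output tables can be joined on the node identifier. This is a sketch only: it assumes both output tables expose the community in a column named COMMUNITY, and the aliases are illustrative. Community values are just labels, so what matters is which nodes share a value within a run, not whether a value is the same across runs.

```sql
-- Compare the community of each user in the unweighted and weighted runs.
SELECT
    u.NODEID,
    u.COMMUNITY AS COMMUNITY_UNWEIGHTED,
    w.COMMUNITY AS COMMUNITY_WEIGHTED
FROM EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY AS u
JOIN EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY_WEIGHTED AS w
    ON w.NODEID = u.NODEID
ORDER BY u.NODEID;
```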

Compared to the unweighted run of the algorithm we still have two communities, but they contain two and four nodes respectively. Using the weighted relationships, the nodes Alice and Doug are now in the same community, as there is a strong link between them (weight 5).
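
The SEED column created in the USERS table is not used by either run above. A seeded run could be sketched as follows, using the seedProperty parameter from the Compute configuration table. Two points are assumptions rather than documented behaviour: that the SEED column is projected as a node property named 'SEED', and that rows with a NULL seed are accepted (this page does not describe how missing seeds are handled). The output table name USERS_COMMUNITY_SEEDED is hypothetical. No results are shown because this call has not been run here.

```sql
CALL Neo4j_Graph_Analytics.graph.label_propagation('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'USERS' ],
        'relationshipTables': {
            'LINKS': {
                'sourceTable': 'USERS',
                'targetTable': 'USERS'
            }
        }
    },
    'compute': {
        -- Assumption: the SEED column of USERS is available as node property 'SEED'.
        'seedProperty': 'SEED'
    },
    'write': [{
        'nodeLabel': 'USERS',
        -- Hypothetical output table name, chosen for this sketch.
        'outputTable': 'USERS_COMMUNITY_SEEDED'
    }]
});
```

In the example data, Alice, Bridget and Charles all carry the seed value 42, so this is the semi-supervised setup described in the Introduction: some initial communities are hand-picked and the remaining nodes are left for the algorithm to assign.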