Leiden

Introduction

The Leiden algorithm detects communities in large networks. It separates nodes into disjoint communities so as to maximize a modularity score. Modularity quantifies the quality of an assignment of nodes to communities, that is, how densely connected the nodes within a community are compared to how connected they would be in a random network.
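As a rough illustration of the modularity score, the following Python sketch computes the standard modularity of a given partition of an undirected graph. The function name `modularity`, the `gamma` argument and the toy graph are assumptions made for this example; they are not part of the Neo4j Graph Analytics API, and the library's internal computation may differ in detail.

```python
from collections import defaultdict

def modularity(edges, community, gamma=1.0):
    """Modularity of a partition of an undirected, optionally weighted graph.

    edges: iterable of (u, v, weight) tuples, one per undirected relationship.
    community: dict mapping each node to its community ID.
    gamma: resolution parameter (1.0 gives the classic definition).
    """
    total = 0.0                     # sum of all relationship weights
    internal = defaultdict(float)   # weight of relationships inside each community
    strength = defaultdict(float)   # summed weighted degree of each community
    for u, v, w in edges:
        total += w
        strength[community[u]] += w
        strength[community[v]] += w
        if community[u] == community[v]:
            internal[community[u]] += w
    return sum(
        internal[c] / total - gamma * (strength[c] / (2 * total)) ** 2
        for c in strength
    )

# Toy data: two triangles joined by a single bridge relationship.
toy_edges = [
    (1, 2, 1.0), (2, 3, 1.0), (1, 3, 1.0),
    (4, 5, 1.0), (5, 6, 1.0), (4, 6, 1.0),
    (3, 4, 1.0),
]
split = {1: "A", 2: "A", 3: "A", 4: "B", 5: "B", 6: "B"}
merged = {node: "A" for node in split}

q_split = modularity(toy_edges, split)
q_merged = modularity(toy_edges, merged)
print(q_split, q_merged)
```

Placing each triangle in its own community scores higher than putting all six nodes into one community, which is the kind of difference the algorithm tries to exploit.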

The Leiden algorithm is a hierarchical clustering algorithm that greedily optimizes the modularity, merges each resulting community into a single node, and then repeats the process on the condensed graph. It modifies the Louvain algorithm to address some of its shortcomings, namely the case where some of the communities found by Louvain are not well-connected. This is achieved by periodically breaking communities down at random into smaller, well-connected ones.
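To make the connectivity problem more concrete, the sketch below flags communities whose members cannot all reach each other through relationships inside the community. Plain connectivity is only a minimal proxy for the stronger notion of well-connectedness used in the paper, and the function name `disconnected_communities` and the example partition are hypothetical.

```python
from collections import defaultdict, deque

def disconnected_communities(edges, community):
    """Return the IDs of communities whose members are not all reachable
    from one another using only relationships inside the community."""
    inside = defaultdict(set)    # adjacency restricted to intra-community edges
    members = defaultdict(set)   # community ID -> set of member nodes
    for node, c in community.items():
        members[c].add(node)
    for u, v in edges:
        if community[u] == community[v]:
            inside[u].add(v)
            inside[v].add(u)
    broken = []
    for c, nodes in members.items():
        # Breadth-first search from an arbitrary member of the community.
        start = next(iter(nodes))
        seen = {start}
        queue = deque([start])
        while queue:
            for nxt in inside[queue.popleft()]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        if seen != nodes:
            broken.append(c)
    return sorted(broken)

# Hypothetical partition: 'a' and 'b' are only linked through 'hub',
# but 'hub' has been assigned to a different community.
edges = [("a", "hub"), ("b", "hub"), ("hub", "c"), ("c", "d")]
partition = {"a": 0, "b": 0, "hub": 1, "c": 1, "d": 1}
print(disconnected_communities(edges, partition))
```

Here community 0 is reported because nothing connects its two members once the hub node is placed elsewhere.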

For more information on this algorithm, see:

V.A. Traag, L. Waltman and N.J. van Eck "From Louvain to Leiden: guaranteeing well-connected communities"

Syntax

This section covers the syntax used to execute the Leiden algorithm.

Run Leiden.

CALL Neo4j_Graph_Analytics.graph.leiden(
  'CPU_X64_XS',                    (1)
  {
    ['defaultTablePrefix': '...',] (2)
    'project': {...},              (3)
    'compute': {...},              (4)
    'write':   {...}               (5)
  }
);

1	Compute pool selector.
2	Optional prefix for table references.
3	Project config.
4	Compute config.
5	Write config.

Table 1. Parameters
Name	Type	Default	Optional	Description
computePoolSelector	String	`n/a`	no	The selector for the compute pool on which to run the Leiden job.
configuration	Map	`{}`	no	Configuration for graph project, algorithm compute and result write back.

The configuration map consists of the following three entries.

For more details on the Project configuration below, refer to the Project documentation.

Table 2. Project configuration
Name	Description
nodeTables	List of node tables.
relationshipTables	Map of relationship types to relationship tables.

Table 3. Compute configuration
Name	Type	Default	Optional	Description
mutateProperty	String	`'community'`	yes	The node property that will be written back to the Snowflake database.
relationshipWeightProperty	String	`null`	yes	Name of the relationship property to use as weights. If unspecified, the algorithm runs unweighted.
seedProperty	String	`n/a`	yes	Used to set the initial community for a node. The property value needs to be a non-negative number.
maxLevels	Integer	`10`	yes	The maximum number of levels in which the graph is clustered and then condensed.
tolerance	Float	`0.0001`	yes	Minimum change in modularity between iterations. If the modularity changes less than the tolerance value, the result is considered stable and the algorithm returns.
includeIntermediateCommunities	Boolean	`false`	yes	Indicates whether to write intermediate communities. If set to false, only the final community is persisted.
gamma	Float	`1.0`	yes	Resolution parameter used when computing the modularity. Internally the value is divided by the number of relationships for an unweighted graph, or the sum of weights of all relationships otherwise. ^[1]
theta	Float	`0.01`	yes	Controls the randomness while breaking a community into smaller ones.
1. Higher resolutions lead to more communities, while lower resolutions lead to fewer communities.
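The effect of the resolution parameter can be sketched by scoring a few fixed candidate partitions at different `gamma` values. This does not run Leiden itself; it only compares hand-picked partitions of a toy graph using the standard resolution-adjusted modularity, and the names `quality`, `candidates` and `best` are assumptions for this example.

```python
def quality(edges, community, gamma):
    """Modularity with resolution gamma for an unweighted, undirected graph."""
    m = len(edges)
    score = 0.0
    for c in set(community.values()):
        # Relationships with both endpoints in community c.
        inside = sum(1 for u, v in edges if community[u] == c and community[v] == c)
        # Sum of degrees of the nodes in community c.
        degree = sum((community[u] == c) + (community[v] == c) for u, v in edges)
        score += inside / m - gamma * (degree / (2 * m)) ** 2
    return score

# Toy data: two triangles joined by a single bridge relationship.
edges = [(1, 2), (2, 3), (1, 3), (4, 5), (5, 6), (4, 6), (3, 4)]
candidates = {
    "one community": {n: 0 for n in range(1, 7)},
    "two triangles": {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1},
    "six singletons": {n: n for n in range(1, 7)},
}

def best(gamma):
    """Name of the candidate partition with the highest score at this gamma."""
    return max(candidates, key=lambda name: quality(edges, candidates[name], gamma))

for gamma in (0.2, 1.0, 3.0):
    print(gamma, best(gamma))
```

On this toy graph the preferred candidate moves from one community, to two, to six as `gamma` grows, in line with the note above.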

For more details on the Write configuration below, refer to the Write documentation.

Table 4. Write configuration
Name	Type	Default	Optional	Description
nodeProperty	String	`'community'`	yes	The node property that will be written back to the Snowflake database.

Examples

In this section we show examples of running the Leiden community detection algorithm on a concrete graph. The intention is to illustrate what the results look like and to provide a guide to using the algorithm in a real setting. We do this on a small social network graph of a handful of nodes connected in a particular pattern.

The following SQL statements create the example graph tables in the Snowflake database:

CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.USERS (NODEID VARCHAR, SEED NUMBER);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.USERS VALUES
  ('Alice', 42),
  ('Bridget', 42),
  ('Charles', 42),
  ('Doug', NULL),
  ('Mark', NULL),
  ('Michael', NULL);

CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.LINKS (SOURCENODEID VARCHAR, TARGETNODEID VARCHAR, WEIGHT FLOAT);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.LINKS VALUES
  ('Alice',   'Bridget', 1),
  ('Alice',   'Charles', 1),
  ('Charles', 'Bridget', 1),
  ('Alice',   'Doug',    5),
  ('Mark',    'Doug',    1),
  ('Mark',    'Michael', 1),
  ('Michael', 'Mark',    1);

This graph has two clusters of users that are closely connected. A single relationship connects the two clusters. The relationships have a WEIGHT property, which determines the strength of each relationship.

We load the LINKS relationships with orientation set to UNDIRECTED, as this works best with the Leiden algorithm.

With the node and relationship tables in Snowflake, we can now project them as a graph as part of an algorithm job. In the following examples we demonstrate using the Leiden algorithm on this graph.

Run job

Running a Leiden job involves three steps: Project, Compute and Write.

Running the query requires a setup of grants for the application, your consumer role and your environment. See the Getting started page for more on this.

We also assume that the application name is the default Neo4j_Graph_Analytics. If you chose a different app name during installation, please replace it with that.

The following will run a Leiden job:

CALL Neo4j_Graph_Analytics.graph.leiden('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'USERS' ],
        'relationshipTables': {
            'LINKS': {
                'sourceTable': 'USERS',
                'targetTable': 'USERS',
                'orientation': 'UNDIRECTED'
            }
        }
    },
    'compute': {
        'randomSeed': 19
    },
    'write': [{
        'nodeLabel': 'USERS',
        'outputTable': 'USERS_COMMUNITY'
    }]
});

The returned result contains information about the job execution and result distribution. Additionally, the community ID for each of the nodes has been written back to the Snowflake database. We can query it like so:

SELECT * FROM EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY;

Table 5. Results
NODEID	COMMUNITY
Alice	2
Bridget	2
Charles	2
Doug	5
Mark	5
Michael	5

Except for the random seed, we use default values for the procedure configuration parameters: maxLevels is 10, and gamma and theta are 1.0 and 0.01, respectively.

Weighted

The Leiden algorithm can also run on weighted graphs, taking the given relationship weights into account when calculating the modularity.

The following will run the algorithm on a weighted graph:

CALL Neo4j_Graph_Analytics.graph.leiden('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'USERS' ],
        'relationshipTables': {
            'LINKS': {
                'sourceTable': 'USERS',
                'targetTable': 'USERS',
                'orientation': 'UNDIRECTED'
            }
        }
    },
    'compute': {
        'randomSeed': 19,
        'relationshipWeightProperty': 'WEIGHT'
    },
    'write': [{
        'nodeLabel': 'USERS',
        'outputTable': 'USERS_COMMUNITY'
    }]
});

Table 6. Results
JOB_ID	JOB_START	JOB_END
job_7783bee73d084df19e254550b9a3a186	2025-07-16 08:56:33.449	2025-07-16 08:56:38.060

JOB_RESULT

 {
  "leiden_1": {
    "communityCount": 3,
    "communityDistribution": {
      "max": 2,
      "mean": 2,
      "min": 2,
      "p1": 2,
      "p10": 2,
      "p25": 2,
      "p5": 2,
      "p50": 2,
      "p75": 2,
      "p90": 2,
      "p95": 2,
      "p99": 2,
      "p999": 2
    },
    "computeMillis": 71,
    "configuration": {
      "concurrency": 6,
      "consecutiveIds": false,
      "gamma": 1,
      "includeIntermediateCommunities": false,
      "jobId": "a55c6ad2-1567-4a71-812a-3a8651da2575",
      "logProgress": true,
      "maxLevels": 10,
      "mutateProperty": "community",
      "nodeLabels": [
        "*"
      ],
      "randomSeed": 19,
      "relationshipTypes": [
        "*"
      ],
      "relationshipWeightProperty": "WEIGHT",
      "seedProperty": null,
      "sudo": false,
      "theta": 0.01,
      "tolerance": 1.000000000000000e-04
    },
    "didConverge": true,
    "modularities": [
      0.2933884297520661
    ],
    "modularity": 0.2933884297520661,
    "mutateMillis": 2,
    "nodeCount": 6,
    "nodePropertiesWritten": 6,
    "postProcessingMillis": 30,
    "preProcessingMillis": 8,
    "ranLevels": 1
  },
  "project_1": {
    "graphName": "snowgraph",
    "nodeCount": 6,
    "nodeMillis": 138,
    "relationshipCount": 14,
    "relationshipMillis": 303,
    "totalMillis": 441
  },
  "write_node_property_1": {
    "copyIntoTableMillis": 1109,
    "exportMillis": 1826,
    "nodeLabel": "USERS",
    "nodeProperty": "community",
    "outputTable": "EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY",
    "propertiesExported": 6,
    "stageUploadMillis": 518
  }
}

SELECT * FROM EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY;

Table 7. Results
NODEID	COMMUNITY_ID
Alice	3
Bridget	2
Charles	2
Doug	3
Mark	5
Michael	5

Using the weighted relationships, we see that Alice and Doug have formed their own community, as their link is much stronger than all the others.
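As a sanity check, the weighted modularity of this partition can be recomputed outside Snowflake. The Python sketch below copies the LINKS rows and the community assignments from the tables above and applies the standard weighted modularity formula with resolution 1.0. It assumes that each LINKS row counts as one undirected relationship (so Mark and Michael are joined by two parallel relationships); the function name `weighted_modularity` is hypothetical.

```python
# Rows of the LINKS table: (source, target, weight).
links = [
    ("Alice", "Bridget", 1.0), ("Alice", "Charles", 1.0), ("Charles", "Bridget", 1.0),
    ("Alice", "Doug", 5.0), ("Mark", "Doug", 1.0),
    ("Mark", "Michael", 1.0), ("Michael", "Mark", 1.0),
]
# Community assignments from the weighted run and from the unweighted run.
weighted_result = {"Alice": 3, "Doug": 3, "Bridget": 2, "Charles": 2, "Mark": 5, "Michael": 5}
unweighted_result = {"Alice": 2, "Bridget": 2, "Charles": 2, "Doug": 5, "Mark": 5, "Michael": 5}

def weighted_modularity(links, community):
    """Standard modularity using relationship weights instead of counts."""
    total = sum(w for _, _, w in links)
    score = 0.0
    for c in set(community.values()):
        # Total weight of relationships with both endpoints in community c.
        inside = sum(w for u, v, w in links if community[u] == c and community[v] == c)
        # Summed weighted degree of the nodes in community c.
        strength = sum(w * ((community[u] == c) + (community[v] == c)) for u, v, w in links)
        score += inside / total - (strength / (2 * total)) ** 2
    return score

q_weighted = weighted_modularity(links, weighted_result)
q_unweighted_partition = weighted_modularity(links, unweighted_result)
print(q_weighted, q_unweighted_partition)
```

Under these assumptions the first value agrees with the modularity reported in the job result, and it is higher than the score the unweighted partition would get once weights are taken into account.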

Seeded

It is possible to run the Leiden algorithm incrementally, by providing a seed property. If specified, the seed property provides an initial community mapping for a subset of the loaded nodes. The algorithm will try to keep the seeded community IDs.

The following will run the algorithm with seeded communities:

CALL Neo4j_Graph_Analytics.graph.leiden('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'USERS' ],
        'relationshipTables': {
            'LINKS': {
                'sourceTable': 'USERS',
                'targetTable': 'USERS',
                'orientation': 'UNDIRECTED'
            }
        }
    },
    'compute': {
        'randomSeed': 19,
        'seedProperty': 'SEED'
    },
    'write': [{
        'nodeLabel': 'USERS',
        'outputTable': 'USERS_COMMUNITY'
    }]
});

Table 8. Results
JOB_ID	JOB_START	JOB_END
job_79891de200694d55a9b6822a5a9c8993	2025-07-16 09:25:30.436	2025-07-16 09:25:35.139

JOB_RESULT

 {
  "leiden_1": {
    "communityCount": 2,
    "communityDistribution": {
      "max": 3,
      "mean": 3,
      "min": 3,
      "p1": 3,
      "p10": 3,
      "p25": 3,
      "p5": 3,
      "p50": 3,
      "p75": 3,
      "p90": 3,
      "p95": 3,
      "p99": 3,
      "p999": 3
    },
    "computeMillis": 48,
    "configuration": {
      "concurrency": 6,
      "consecutiveIds": false,
      "gamma": 1,
      "includeIntermediateCommunities": false,
      "jobId": "3f7cf812-cb71-4312-9662-7e6a3a1ec9d5",
      "logProgress": true,
      "maxLevels": 10,
      "mutateProperty": "community",
      "nodeLabels": [
        "*"
      ],
      "randomSeed": 19,
      "relationshipTypes": [
        "*"
      ],
      "seedProperty": "SEED",
      "sudo": false,
      "theta": 0.01,
      "tolerance": 1.000000000000000e-04
    },
    "didConverge": true,
    "modularities": [
      0.3571428571428571
    ],
    "modularity": 0.3571428571428571,
    "mutateMillis": 1,
    "nodeCount": 6,
    "nodePropertiesWritten": 6,
    "postProcessingMillis": 20,
    "preProcessingMillis": 6,
    "ranLevels": 1
  },
  "project_1": {
    "graphName": "snowgraph",
    "nodeCount": 6,
    "nodeMillis": 136,
    "relationshipCount": 14,
    "relationshipMillis": 342,
    "totalMillis": 478
  },
  "write_node_property_1": {
    "copyIntoTableMillis": 978,
    "exportMillis": 1730,
    "nodeLabel": "USERS",
    "nodeProperty": "community",
    "outputTable": "EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY",
    "propertiesExported": 6,
    "stageUploadMillis": 536
  }
}

SELECT * FROM EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY;

Table 9. Results
NODEID	COMMUNITY_ID
Alice	42
Bridget	42
Charles	42
Doug	45
Mark	45
Michael	45

As can be seen, using the seeded graph, node Alice keeps its initial community ID of 42. The other community has been assigned a new community ID which is guaranteed to be larger than the largest seeded community ID.

Using intermediate communities

As described before, Leiden is a hierarchical clustering algorithm. That means that after every clustering step, all nodes that belong to the same cluster are reduced to a single node. Relationships between nodes of the same cluster become self-relationships, while relationships to nodes of other clusters connect to the cluster's representative. This condensed graph is then used to run the next level of clustering. The process is repeated until the clusters are stable.
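The condensation step described above can be sketched in a few lines of Python. This is a minimal illustration of collapsing clusters into single nodes, not the library's implementation; the function name `condense` and the toy graph are assumptions for this example.

```python
from collections import defaultdict

def condense(edges, community):
    """Collapse every community into a single node.

    Returns a dict {(c1, c2): weight} with c1 <= c2.
    Entries of the form (c, c) are the self-relationships of cluster c.
    """
    condensed = defaultdict(float)
    for u, v, w in edges:
        a, b = sorted((community[u], community[v]))
        condensed[(a, b)] += w
    return dict(condensed)

# Toy data: two triangles joined by a single bridge relationship.
edges = [
    (1, 2, 1.0), (2, 3, 1.0), (1, 3, 1.0),
    (4, 5, 1.0), (5, 6, 1.0), (4, 6, 1.0),
    (3, 4, 1.0),
]
community = {1: 0, 2: 0, 3: 0, 4: 1, 5: 1, 6: 1}
print(condense(edges, community))
```

Each triangle becomes one node carrying a self-relationship of weight 3, and the bridge becomes a single relationship of weight 1 between the two condensed nodes.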

In order to demonstrate this iterative behavior, we need to construct a more complex graph.

The following SQL statements will create a multi-level graph, encoded in Snowflake tables:

CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.NODES (NODEID STRING);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.NODES VALUES
  ('a'),
  ('b'),
  ('c'),
  ('d'),
  ('e'),
  ('f'),
  ('g'),
  ('h'),
  ('i'),
  ('j'),
  ('k'),
  ('l'),
  ('m'),
  ('n'),
  ('x');

CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.TYPES (SOURCENODEID STRING, TARGETNODEID STRING);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.TYPES VALUES
  ('a', 'b'),
  ('a', 'd'),
  ('a', 'f'),
  ('b', 'd'),
  ('b', 'x'),
  ('b', 'g'),
  ('b', 'e'),
  ('c', 'x'),
  ('c', 'f'),
  ('d', 'k'),
  ('e', 'x'),
  ('e', 'f'),
  ('e', 'h'),
  ('f', 'g'),
  ('g', 'h'),
  ('h', 'i'),
  ('h', 'j'),
  ('i', 'k'),
  ('j', 'k'),
  ('j', 'm'),
  ('j', 'n'),
  ('k', 'm'),
  ('k', 'l'),
  ('l', 'n'),
  ('m', 'n');

Now we can see the iterative flow of the algorithm:

The following runs the algorithm and outputs the intermediate communities:

CALL Neo4j_Graph_Analytics.graph.leiden('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'NODES' ],
        'relationshipTables': {
            'TYPES': {
                'sourceTable': 'NODES',
                'targetTable': 'NODES',
                'orientation': 'UNDIRECTED'
            }
        }
    },
    'compute': {
        'randomSeed': 23,
        'includeIntermediateCommunities': true
    },
    'write': [{
        'nodeLabel': 'NODES',
        'outputTable': 'NODES_INTERMEDIATE_COMMUNITY'
    }]
});

Table 10. Results
JOB_ID	JOB_START	JOB_END
job_d2aa3973d9744157b2a6be3cc30a3ee6	2025-07-16 09:09:59.400	2025-07-16 09:10:05.025

JOB_RESULT

 {
  "leiden_1": {
    "communityCount": 3,
    "communityDistribution": {
      "max": 7,
      "mean": 5,
      "min": 3,
      "p1": 3,
      "p10": 3,
      "p25": 3,
      "p5": 3,
      "p50": 5,
      "p75": 7,
      "p90": 7,
      "p95": 7,
      "p99": 7,
      "p999": 7
    },
    "computeMillis": 108,
    "configuration": {
      "concurrency": 6,
      "consecutiveIds": false,
      "gamma": 1,
      "includeIntermediateCommunities": true,
      "jobId": "b6d95032-1a8c-41e9-91d9-4054aa328f9b",
      "logProgress": true,
      "maxLevels": 10,
      "mutateProperty": "community",
      "nodeLabels": [
        "*"
      ],
      "randomSeed": 19,
      "relationshipTypes": [
        "*"
      ],
      "seedProperty": null,
      "sudo": false,
      "theta": 0.01,
      "tolerance": 1.000000000000000e-04
    },
    "didConverge": true,
    "modularities": [
      0.37599999999999995,
      0.3816
    ],
    "modularity": 0.3816,
    "mutateMillis": 2,
    "nodeCount": 15,
    "nodePropertiesWritten": 15,
    "postProcessingMillis": 18,
    "preProcessingMillis": 8,
    "ranLevels": 2
  },
  "project_1": {
    "graphName": "snowgraph",
    "nodeCount": 15,
    "nodeMillis": 393,
    "relationshipCount": 50,
    "relationshipMillis": 529,
    "totalMillis": 922
  },
  "write_node_property_1": {
    "copyIntoTableMillis": 911,
    "exportMillis": 1767,
    "nodeLabel": "NODES",
    "nodeProperty": "community",
    "outputTable": "EXAMPLE_DB.DATA_SCHEMA.NODES_INTERMEDIATE_COMMUNITY",
    "propertiesExported": 15,
    "stageUploadMillis": 582
  }
}

SELECT * FROM EXAMPLE_DB.DATA_SCHEMA.NODES_INTERMEDIATE_COMMUNITY;

Table 11. Results
NODEID	INTERMEDIATE_COMMUNITIES
a	[3, 1]
b	[3, 1]
c	[14, 1]
d	[3, 1]
e	[14, 1]
f	[14, 1]
g	[8, 2]
h	[8, 2]
i	[8, 2]
j	[12, 3]
k	[12, 3]
l	[12, 3]
m	[12, 3]
n	[12, 3]
x	[14, 1]

In this example graph, after the first iteration we see four clusters, which in the second iteration are reduced to three.
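The per-level numbers can be cross-checked with a short Python sketch that recomputes the cluster count and the standard modularity (resolution 1.0, unweighted) for each level. The relationship list and intermediate communities are copied from the tables above; the helper name `level_modularity` is hypothetical.

```python
# Rows of the TYPES table, written as two-letter source/target pairs.
types = "ab ad af bd bx bg be cx cf dk ex ef eh fg gh hi hj ik jk jm jn km kl ln mn".split()

# Intermediate communities per node: [level 0, level 1].
intermediate = {
    "a": [3, 1], "b": [3, 1], "c": [14, 1], "d": [3, 1], "e": [14, 1],
    "f": [14, 1], "g": [8, 2], "h": [8, 2], "i": [8, 2], "j": [12, 3],
    "k": [12, 3], "l": [12, 3], "m": [12, 3], "n": [12, 3], "x": [14, 1],
}

def level_modularity(level):
    """Standard modularity of the partition found at the given level."""
    m = len(types)
    community = {node: ids[level] for node, ids in intermediate.items()}
    score = 0.0
    for c in set(community.values()):
        # Relationships with both endpoints in community c.
        inside = sum(1 for u, v in types if community[u] == c and community[v] == c)
        # Sum of degrees of the nodes in community c.
        degree = sum((community[u] == c) + (community[v] == c) for u, v in types)
        score += inside / m - (degree / (2 * m)) ** 2
    return score

for level in (0, 1):
    clusters = {ids[level] for ids in intermediate.values()}
    print(level, len(clusters), round(level_modularity(level), 4))
```

The two levels yield four and three clusters, and the computed scores line up with the two entries of the modularities list in the job result.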