Louvain

Introduction

The Louvain method is an algorithm to detect communities in large networks. It maximizes a modularity score for each community, where the modularity quantifies the quality of an assignment of nodes to communities. This means evaluating how much more densely connected the nodes within a community are, compared to how connected they would be in a random network.
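
For reference, the modularity that Louvain optimizes has a standard definition in the literature (this is the general formulation, not something specific to this implementation):

Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)

where A_{ij} is the weight of the relationship between nodes i and j (1 in an unweighted graph), k_i is the sum of the weights of the relationships attached to node i, c_i is the community of node i, \delta(c_i, c_j) is 1 if both nodes are in the same community and 0 otherwise, and m is the total relationship weight in the graph.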

The Louvain algorithm is a hierarchical clustering algorithm: it recursively merges each community into a single node and executes the modularity clustering on the condensed graph.

For more information on this algorithm, see:

Vincent D. Blondel, Jean-Loup Guillaume, Renaud Lambiotte, Etienne Lefebvre: "Fast unfolding of communities in large networks", Journal of Statistical Mechanics: Theory and Experiment (2008).

Syntax

This section covers the syntax used to execute the Louvain algorithm.

Run Louvain.
CALL Neo4j_Graph_Analytics.graph.louvain(
  'CPU_X64_XS',                    (1)
  {
    ['defaultTablePrefix': '...',] (2)
    'project': {...},              (3)
    'compute': {...},              (4)
    'write':   {...}               (5)
  }
);
1 Compute pool selector.
2 Optional prefix for table references.
3 Project config.
4 Compute config.
5 Write config.
Table 1. Parameters
Name | Type | Default | Optional | Description
computePoolSelector | String | n/a | no | The selector for the compute pool on which to run the Louvain job.
configuration | Map | {} | no | Configuration for graph project, algorithm compute and result write back.

The configuration map consists of the following three entries.

For more details on the project configuration below, refer to the Project documentation.
Table 2. Project configuration
Name | Type
nodeTables | List of node tables.
relationshipTables | Map of relationship types to relationship tables.

Table 3. Compute configuration
Name | Type | Default | Optional | Description
mutateProperty | String | 'community' | yes | The node property that will be written back to the Snowflake database.
relationshipWeightProperty | String | null | yes | Name of the relationship property to use as weights. If unspecified, the algorithm runs unweighted.
seedProperty | String | n/a | yes | Used to set the initial community for a node. The property value needs to be a non-negative number.
maxLevels | Integer | 10 | yes | The maximum number of levels in which the graph is clustered and then condensed.
maxIterations | Integer | 10 | yes | The maximum number of iterations that the modularity optimization will run for each level.
tolerance | Float | 0.0001 | yes | Minimum change in modularity between iterations. If the modularity changes less than the tolerance value, the result is considered stable and the algorithm returns.
includeIntermediateCommunities | Boolean | false | yes | Indicates whether to write intermediate communities. If set to false, only the final community is persisted.
consecutiveIds | Boolean | false | yes | Flag to decide whether community identifiers are mapped into a consecutive id space (requires additional memory). Cannot be used in combination with the includeIntermediateCommunities flag.
minCommunitySize | Integer | 0 | yes | Only nodes in communities of a size greater than or equal to the given value are returned.
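
The following sketch shows how several of the compute parameters above can be combined in a single job. It reuses the USERS and LINKS example tables created in the Examples section below; the property name, thresholds and output table are illustrative choices, not recommendations:
CALL Neo4j_Graph_Analytics.graph.louvain('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'USERS' ],
        'relationshipTables': {
            'LINKS': {
                'sourceTable': 'USERS',
                'targetTable': 'USERS',
                'orientation': 'UNDIRECTED'
            }
        }
    },
    'compute': {
        'mutateProperty': 'community_id',
        'relationshipWeightProperty': 'WEIGHT',
        'maxLevels': 5,
        'maxIterations': 20,
        'tolerance': 0.001,
        'minCommunitySize': 2
    },
    'write': [{
        'nodeLabel': 'USERS',
        'outputTable': 'USERS_COMMUNITY',
        'nodeProperty': 'community_id'
    }]
});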

For more details on the write configuration below, refer to the Write documentation.
Table 4. Write configuration
Name | Type | Default | Optional | Description
nodeProperty | String | 'community' | yes | The node property that will be written back to the Snowflake database.

Examples

In this section we will show examples of running the Louvain community detection algorithm on a concrete graph. The intention is to illustrate what the results look like and to provide a guide on how to make use of the algorithm in a real setting. We will do this on a small social network graph of a handful of nodes connected in a particular pattern. The example graph looks like this:

Visualization of the example graph
The following SQL statement will create the example graph tables in the Snowflake database:
CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.USERS (NODEID STRING, SEED INT);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.USERS VALUES
  ('Alice', 42),
  ('Bridget', 42),
  ('Charles', 42),
  ('Doug', NULL),
  ('Mark', NULL),
  ('Michael', NULL);

CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.LINKS (SOURCENODEID STRING, TARGETNODEID STRING, WEIGHT FLOAT);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.LINKS VALUES
  ('Alice',   'Bridget', 1),
  ('Alice',   'Charles', 1),
  ('Charles', 'Bridget', 1),
  ('Alice',   'Doug',    5),
  ('Mark',    'Doug',    1),
  ('Mark',    'Michael', 1),
  ('Michael', 'Mark',    1);

This graph has two clusters of users that are closely connected internally. Between those clusters there is a single relationship. The relationships that connect the nodes in each cluster have a weight property which determines the strength of the relationship.

We load the LINKS relationships with the orientation set to UNDIRECTED, as this works best with the Louvain algorithm.

With the node and relationship tables in Snowflake we can now project it as part of an algorithm job. In the following examples we will demonstrate using the Louvain algorithm on this graph.

Run job

Running a Louvain job involves three steps: Project, Compute and Write.

To run the query, there is a required setup of grants for the application, your consumer role and your environment. Please see the Getting started page for more on this.

We also assume that the application name is the default Neo4j_Graph_Analytics. If you chose a different app name during installation, please replace it with that.

The following will run a Louvain job:
CALL Neo4j_Graph_Analytics.graph.louvain('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'USERS' ],
        'relationshipTables': {
            'LINKS': {
                'sourceTable': 'USERS',
                'targetTable': 'USERS',
                'orientation': 'UNDIRECTED'
            }
        }
    },
    'compute': {
        'mutateProperty': 'community_id'
    },
    'write': [{
        'nodeLabel': 'USERS',
        'outputTable': 'USERS_COMMUNITY',
        'nodeProperty': 'community_id'
    }]
});

The returned result contains information about the job execution and result distribution. Additionally, the community ID for each of the nodes has been written back to the Snowflake database. We can query it like so:

SELECT * FROM EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY;
Table 5. Results
NODEID | COMMUNITY_ID
Alice | 1
Bridget | 1
Charles | 1
Doug | 3
Mark | 3
Michael | 3
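
A simple follow-up query can aggregate the result table to inspect community sizes (an illustrative usage example, not part of the job itself):
SELECT COMMUNITY_ID, COUNT(*) AS MEMBER_COUNT
FROM EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY
GROUP BY COMMUNITY_ID
ORDER BY MEMBER_COUNT DESC;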

We use the default values for the compute configuration: maxLevels and maxIterations are set to 10 and the tolerance value is 0.0001.
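
The USERS table created above also contains a SEED column. The seedProperty compute parameter documented earlier can use such a column as the initial community assignment. A minimal sketch, assuming the SEED column is projected as a node property of the USERS nodes (the output table name is illustrative):
CALL Neo4j_Graph_Analytics.graph.louvain('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'USERS' ],
        'relationshipTables': {
            'LINKS': {
                'sourceTable': 'USERS',
                'targetTable': 'USERS',
                'orientation': 'UNDIRECTED'
            }
        }
    },
    'compute': {
        'mutateProperty': 'community_id',
        'seedProperty': 'SEED'
    },
    'write': [{
        'nodeLabel': 'USERS',
        'outputTable': 'USERS_COMMUNITY_SEEDED',
        'nodeProperty': 'community_id'
    }]
});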

Weighted

The Louvain algorithm can also run on weighted graphs, taking the given relationship weights into account when calculating the modularity.

The following will run the algorithm on a weighted graph:
CALL Neo4j_Graph_Analytics.graph.louvain('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'USERS' ],
        'relationshipTables': {
            'LINKS': {
                'sourceTable': 'USERS',
                'targetTable': 'USERS',
                'orientation': 'UNDIRECTED'
            }
        }
    },
    'compute': {
        'mutateProperty': 'community_id',
        'relationshipWeightProperty': 'WEIGHT'
    },
    'write': [{
        'nodeLabel': 'USERS',
        'outputTable': 'USERS_COMMUNITY',
        'nodeProperty': 'community_id'
    }]
});
Table 6. Results
JOB_ID | JOB_START | JOB_END
job_2735686e67ae49e7bf90e3f843652da7 | 2025-06-27 09:51:10.964 | 2025-06-27 09:51:15.755
JOB_RESULT:
 {
    "louvain_1": {
      "communityCount": 3,
      "communityDistribution": {
        "max": 2,
        "mean": 2,
        "min": 2,
        "p1": 2,
        "p10": 2,
        "p25": 2,
        "p5": 2,
        "p50": 2,
        "p75": 2,
        "p90": 2,
        "p95": 2,
        "p99": 2,
        "p999": 2
      },
      "computeMillis": 109,
      "configuration": {
        "concurrency": 2,
        "consecutiveIds": false,
        "includeIntermediateCommunities": false,
        "jobId": "36a49484-cf53-4c09-b075-7c7314667879",
        "logProgress": true,
        "maxIterations": 10,
        "maxLevels": 10,
        "mutateProperty": "community_id",
        "nodeLabels": ["*"],
        "relationshipTypes": ["*"],
        "relationshipWeightProperty": "WEIGHT",
        "seedProperty": null,
        "sudo": false,
        "tolerance": 1.000000000000000e-04
      },
      "modularities": [0.2933884297520661],
      "modularity": 0.2933884297520661,
      "mutateMillis": 3,
      "nodePropertiesWritten": 6,
      "postProcessingMillis": 75,
      "preProcessingMillis": 9,
      "ranLevels": 1
    },
    "project_1": {
      "graphName": "snowgraph",
      "nodeCount": 6,
      "nodeMillis": 110,
      "relationshipCount": 14,
      "relationshipMillis": 329,
      "totalMillis": 439
    },
    "write_node_property_1": {
      "exportMillis": 1928,
      "nodeLabel": "USERS",
      "nodeProperty": "community_id",
      "outputTable": "EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY",
      "propertiesExported": 6
    }
}

We can again query the community assignments written back to Snowflake:

SELECT * FROM EXAMPLE_DB.DATA_SCHEMA.USERS_COMMUNITY;
Table 7. Results
NODEID | COMMUNITY_ID
Alice | 3
Bridget | 2
Charles | 2
Doug | 3
Mark | 5
Michael | 5

Using the weighted relationships, we see that Alice and Doug have formed their own community, as their link is much stronger than all the others.

Using intermediate communities

As described before, Louvain is a hierarchical clustering algorithm. That means that after every clustering step, all nodes that belong to the same cluster are reduced to a single node. Relationships between nodes of the same cluster become self-relationships, while relationships to nodes of other clusters connect to the cluster's representative. This condensed graph is then used to run the next level of clustering. The process is repeated until the clusters are stable.

To demonstrate this iterative behaviour, we need to construct a more complex graph.

Visualization of the multi-level example graph
The following SQL statements will create a multi-level graph, encoded in Snowflake tables:
CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.NODES (NODEID STRING);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.NODES VALUES
  ('a'),
  ('b'),
  ('c'),
  ('d'),
  ('e'),
  ('f'),
  ('g'),
  ('h'),
  ('i'),
  ('j'),
  ('k'),
  ('l'),
  ('m'),
  ('n'),
  ('x');

CREATE OR REPLACE TABLE EXAMPLE_DB.DATA_SCHEMA.TYPES (SOURCENODEID STRING, TARGETNODEID STRING);
INSERT INTO EXAMPLE_DB.DATA_SCHEMA.TYPES VALUES
  ('a', 'b'),
  ('a', 'd'),
  ('a', 'f'),
  ('b', 'd'),
  ('b', 'x'),
  ('b', 'g'),
  ('b', 'e'),
  ('c', 'x'),
  ('c', 'f'),
  ('d', 'k'),
  ('e', 'x'),
  ('e', 'f'),
  ('e', 'h'),
  ('f', 'g'),
  ('g', 'h'),
  ('h', 'i'),
  ('h', 'j'),
  ('i', 'k'),
  ('j', 'k'),
  ('j', 'm'),
  ('j', 'n'),
  ('k', 'm'),
  ('k', 'l'),
  ('l', 'n'),
  ('m', 'n');

Now we can see the iterative flow of the algorithm:

The following runs the algorithm and outputs the intermediate communities:
CALL Neo4j_Graph_Analytics.graph.louvain('CPU_X64_XS', {
    'defaultTablePrefix': 'EXAMPLE_DB.DATA_SCHEMA',
    'project': {
        'nodeTables': [ 'NODES' ],
        'relationshipTables': {
            'TYPES': {
                'sourceTable': 'NODES',
                'targetTable': 'NODES',
                'orientation': 'UNDIRECTED'
            }
        }
    },
    'compute': {
        'mutateProperty': 'intermediate_communities',
        'includeIntermediateCommunities': true
    },
    'write': [{
        'nodeLabel': 'NODES',
        'outputTable': 'NODES_INTERMEDIATE_COMMUNITY',
        'nodeProperty': 'intermediate_communities'
    }]
});
Table 8. Results
JOB_ID | JOB_START | JOB_END
job_3369053998a44e73a651b4769f02ca6a | 2025-06-27 11:25:54.526 | 2025-06-27 11:25:59.379
JOB_RESULT:
 {
    "louvain_1": {
      "communityCount": 3,
      "communityDistribution": {
        "max": 7,
        "mean": 5,
        "min": 3,
        "p1": 3,
        "p10": 3,
        "p25": 3,
        "p5": 3,
        "p50": 5,
        "p75": 7,
        "p90": 7,
        "p95": 7,
        "p99": 7,
        "p999": 7
      },
      "computeMillis": 197,
      "configuration": {
        "concurrency": 2,
        "consecutiveIds": false,
        "includeIntermediateCommunities": true,
        "jobId": "a186e31a-6cc9-451a-94f2-8a715a2aad37",
        "logProgress": true,
        "maxIterations": 10,
        "maxLevels": 10,
        "mutateProperty": "intermediate_communities",
        "nodeLabels": ["*"],
        "relationshipTypes": ["*"],
        "seedProperty": null,
        "sudo": false,
        "tolerance": 1.000000000000000e-04
      },
      "modularities": [0.37599999999999995, 0.3816],
      "modularity": 0.3816,
      "mutateMillis": 3,
      "nodePropertiesWritten": 15,
      "postProcessingMillis": 50,
      "preProcessingMillis": 8,
      "ranLevels": 2
    },
    "project_1": {
      "graphName": "snowgraph",
      "nodeCount": 15,
      "nodeMillis": 138,
      "relationshipCount": 50,
      "relationshipMillis": 292,
      "totalMillis": 430
    },
    "write_node_property_1": {
      "exportMillis": 1827,
      "nodeLabel": "NODES",
      "nodeProperty": "intermediate_communities",
      "outputTable": "EXAMPLE_DB.DATA_SCHEMA.NODES_INTERMEDIATE_COMMUNITY",
      "propertiesExported": 15
    }
}

We can query the intermediate community assignments written back to Snowflake:

SELECT * FROM EXAMPLE_DB.DATA_SCHEMA.NODES_INTERMEDIATE_COMMUNITY;
Table 9. Results
NODEID | INTERMEDIATE_COMMUNITIES
a | [3, 14]
b | [3, 14]
c | [14, 14]
d | [3, 14]
e | [14, 14]
f | [14, 14]
g | [7, 7]
h | [7, 7]
i | [7, 7]
j | [12, 12]
k | [12, 12]
l | [12, 12]
m | [12, 12]
n | [12, 12]
x | [14, 14]

In this example graph, after the first level we see four clusters, which in the second level are reduced to three.
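
If only the final assignment from such a run is needed, the last element of each array holds the community at the final level. A minimal sketch, assuming the INTERMEDIATE_COMMUNITIES column is written back as a Snowflake ARRAY:
SELECT
    NODEID,
    -- the last array element is the community at the final level
    GET(INTERMEDIATE_COMMUNITIES, ARRAY_SIZE(INTERMEDIATE_COMMUNITIES) - 1) AS FINAL_COMMUNITY_ID
FROM EXAMPLE_DB.DATA_SCHEMA.NODES_INTERMEDIATE_COMMUNITY;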