Random generation

Random graphs are useful in some situations, for example for testing or benchmarking. For that reason, the Neo4j Graph Data Science library comes with a set of built-in graph generators. A generator stores the resulting graph in the graph catalog, where it can be used as input for any algorithm in the library.

This feature is in the beta tier. For more information on feature tiers, see API Tiers.

It is currently not possible to persist these graphs in Neo4j. Running an algorithm in write mode on a generated graph will lead to unexpected results.
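Because a generated graph exists only in the graph catalog, a typical workflow is to generate it, run an algorithm in a non-writing mode, and drop it afterwards. The following is a minimal sketch of that workflow. It assumes that the PageRank stream procedure (gds.pageRank.stream) is available in the installed version; the graph name 'benchmarkGraph' and the sizes are illustrative and do not come from this page.

```cypher
// Generate a throwaway in-memory graph (name and sizes are illustrative).
CALL gds.graph.generate('benchmarkGraph', 1000, 5)
YIELD name, nodes, relationships;

// Run an algorithm in stream mode; nothing is written back to Neo4j.
CALL gds.pageRank.stream('benchmarkGraph')
YIELD nodeId, score
RETURN nodeId, score
ORDER BY score DESC
LIMIT 5;

// Remove the generated graph from the catalog when it is no longer needed.
CALL gds.graph.drop('benchmarkGraph')
YIELD graphName;
```

The three statements are meant to be run one after another, for example in a client that accepts multiple statements separated by semicolons.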

The graph generation is parameterized by three dimensions:

  • node count - the number of nodes in the generated graph

  • average degree - describes the average out-degree of the generated nodes

  • relationship distribution function - the probability distribution method used to connect generated nodes

Syntax

The following describes the API for running the graph generation procedure:
CALL gds.graph.generate(
    graphName: String,
    nodeCount: Integer,
    averageDegree: Integer,
    configuration: Map
)
YIELD name, nodes, relationships, generateMillis, relationshipSeed, averageDegree, relationshipDistribution, relationshipProperty
Table 1. Parameters
Name | Type | Default | Optional | Description
graphName | String | null | no | The name under which the generated graph is stored.
nodeCount | Integer | null | no | The number of generated nodes.
averageDegree | Integer | null | no | The average out-degree of generated nodes.
configuration | Map | {} | yes | Additional configuration, see below.

Table 2. Configuration
Name | Type | Default | Optional | Description
relationshipDistribution | String | UNIFORM | yes | The probability distribution method used to connect generated nodes. For more information see Relationship Distribution.
relationshipSeed | Integer | null | yes | The seed used for generating relationships.
relationshipProperty | Map | {} | yes | Describes the method used to generate a relationship property. By default no relationship property is generated. For more information see Relationship Property.
aggregation | String | NONE | yes | The relationship aggregation method, cf. Relationship Projection.
orientation | String | NATURAL | yes | The method of orienting edges. Allowed values are NATURAL, REVERSE and UNDIRECTED.
allowSelfLoops | Boolean | false | yes | Whether to allow relationships with identical source and target node.
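
As an illustration of how the configuration map is passed, the following sketch combines several of the options from the table above. The graph name 'undirectedGraph', the sizes and the seed value are illustrative assumptions; only the option names and allowed values come from the table.

```cypher
// Generate an undirected graph without self-loops, using a fixed seed.
// Graph name, node count, degree and seed are illustrative values.
CALL gds.graph.generate('undirectedGraph', 100, 3, {
  orientation: 'UNDIRECTED',
  allowSelfLoops: false,
  relationshipSeed: 42
})
YIELD name, nodes, relationships, averageDegree
```

Any option left out of the map falls back to the default listed in the table.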

Table 3. Results
Name | Type | Description
name | String | The name under which the generated graph is stored.
nodes | Integer | The number of nodes in the graph.
relationships | Integer | The number of relationships in the graph.
generateMillis | Integer | Milliseconds for generating the graph.
relationshipSeed | Integer | The seed used for generating relationships.
averageDegree | Float | The average out-degree of the generated nodes.
relationshipDistribution | String | The probability distribution method used to connect generated nodes.
relationshipProperty | String | The configuration of the generated relationship property.

Relationship Distribution

The relationshipDistribution parameter controls the statistical method used for the generation of new relationships. Currently there are three supported methods:

  • UNIFORM - Distributes the outgoing relationships evenly, i.e., every node has exactly the same out-degree (equal to the average degree). The target nodes are selected randomly.

  • RANDOM - Distributes the outgoing relationships using a normal distribution with a mean of averageDegree and a standard deviation of 2 * averageDegree. The target nodes are selected randomly.

  • POWER_LAW - Distributes the incoming relationships using a power law distribution. The out-degree is based on a normal distribution.
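
One way to see the effect of a distribution method is to generate a graph and inspect its degrees. The sketch below does this for POWER_LAW by listing the nodes with the highest in-degree. It assumes that the Degree Centrality procedure (gds.degree.stream) and its orientation option are available in the installed version; the graph name 'powerLawGraph', the sizes and the seed are illustrative.

```cypher
// Generate a graph whose incoming relationships follow a power law.
CALL gds.graph.generate('powerLawGraph', 1000, 5, {
  relationshipDistribution: 'POWER_LAW',
  relationshipSeed: 42
})
YIELD name, nodes, relationships, relationshipDistribution;

// Count incoming relationships per node by computing the degree
// on the reversed orientation, and show the ten largest values.
CALL gds.degree.stream('powerLawGraph', {orientation: 'REVERSE'})
YIELD nodeId, score
RETURN nodeId, score AS inDegree
ORDER BY inDegree DESC
LIMIT 10;
```

Running the second query against graphs generated with UNIFORM or RANDOM gives a basis for comparison.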

Relationship Property

The graph generator can also generate a relationship property. This is controlled by the relationshipProperty parameter, a map that accepts the following keys:

Table 4. Configuration
Name | Type | Default | Optional | Description
name | String | null | no | The name under which the property values are stored.
type | String | null | no | The method used to generate property values.
min | Float | 0.0 | yes | Minimum value of the generated property (only supported by RANDOM).
max | Float | 1.0 | yes | Maximum value of the generated property (only supported by RANDOM).
value | Float | null | yes | Fixed value assigned to every relationship (only supported by FIXED).

Currently, there are two supported methods to generate relationship properties:

  • FIXED - Assigns a fixed value to every relationship. The value parameter must be set.

  • RANDOM - Assigns a random value between the lower (min) and upper (max) bound.
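
The FIXED method can be sketched as follows. The graph name 'fixedWeightGraph', the property name 'weight' and the value 1.0 are illustrative assumptions; the keys of the relationshipProperty map are the ones listed in the table above.

```cypher
// Generate a graph in which every relationship carries the same value.
CALL gds.graph.generate('fixedWeightGraph', 5, 2, {
  relationshipProperty: {type: 'FIXED', value: 1.0, name: 'weight'}
})
YIELD name, nodes, relationships, relationshipProperty;

// Stream the relationships together with the generated property.
CALL gds.graph.relationshipProperty.stream('fixedWeightGraph', 'weight')
YIELD sourceNodeId, targetNodeId, propertyValue
RETURN sourceNodeId AS source, targetNodeId AS target, propertyValue AS weight
ORDER BY source ASC, target ASC;
```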

Relationship Seed

The relationshipSeed parameter allows the user to manually specify the seed used to generate the random graph. When a seed is specified, the procedure produces the same relationships between nodes regardless of whether the generated graph is weighted or unweighted. This is helpful when comparing the behavior or performance of an algorithm on weighted and unweighted versions of the same topology.

Examples

All the examples below should be run in an empty database.

In the following we will demonstrate the usage of the random graph generation procedure.

Generating unweighted graphs

The following will produce a graph with unweighted relationships:
CALL gds.graph.generate('graph', 5, 2, {relationshipSeed: 19})
YIELD name, nodes, relationships, relationshipDistribution
Table 5. Results
name | nodes | relationships | relationshipDistribution
"graph" | 5 | 10 | "UNIFORM"

A new in-memory graph called graph with 5 nodes and 10 relationships has been created and added to the graph catalog. We can examine its topology with the gds.graph.relationships procedure.

The following will show the produced relationships:
CALL gds.graph.relationships.stream('graph')
YIELD sourceNodeId, targetNodeId
RETURN sourceNodeId AS source, targetNodeId AS target
ORDER BY source ASC, target ASC
Table 6. Results
source | target
0 | 1
0 | 2
1 | 0
1 | 4
2 | 1
2 | 4
3 | 0
3 | 1
4 | 0
4 | 3

Generating weighted graphs

To generate graphs with weighted relationships, we must specify the relationshipProperty parameter as discussed above.

The following will produce a graph with weighted relationships:
CALL gds.graph.generate('weightedGraph', 5, 2, {
  relationshipSeed: 19,
  relationshipProperty: {type: 'RANDOM', min: 5.0, max: 10.0, name: 'score'}
})
YIELD name, nodes, relationships, relationshipDistribution
Table 7. Results
name | nodes | relationships | relationshipDistribution
"weightedGraph" | 5 | 10 | "UNIFORM"

The produced graph, weightedGraph, has a property named score containing a random value between 5.0 and 10.0 for each relationship. We can use gds.graph.relationshipProperty.stream to stream the relationships of the graph along with their score values.

The following will show the produced relationships:
CALL gds.graph.relationshipProperty.stream('weightedGraph', 'score')
YIELD sourceNodeId, targetNodeId, propertyValue
RETURN sourceNodeId AS source, targetNodeId AS target, propertyValue AS score
ORDER BY source ASC, target ASC, score
Table 8. Results
source | target | score
0 | 1 | 6.791408433596591
0 | 2 | 8.662453313014902
1 | 0 | 6.258381821615686
1 | 4 | 9.711806397654765
2 | 1 | 9.469695236791349
2 | 4 | 6.519823445755963
3 | 0 | 8.747179900968224
3 | 1 | 7.752117836610726
4 | 0 | 8.614858979680758
4 | 3 | 5.060444167785128

Notice that because graph and weightedGraph were created with the same seed, their relationship topologies are identical.
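
Rather than comparing the two result tables by eye, the topologies can also be compared in a single query. The following is a minimal sketch that uses only the gds.graph.relationships.stream procedure shown earlier; the column name sameTopology is an illustrative choice.

```cypher
// Collect the (source, target) pairs of the unweighted graph.
CALL gds.graph.relationships.stream('graph')
YIELD sourceNodeId, targetNodeId
WITH collect([sourceNodeId, targetNodeId]) AS unweighted

// Collect the (source, target) pairs of the weighted graph.
CALL gds.graph.relationships.stream('weightedGraph')
YIELD sourceNodeId, targetNodeId
WITH unweighted, collect([sourceNodeId, targetNodeId]) AS weighted

// The topologies match if both lists have the same size and every
// pair of the first list also appears in the second.
RETURN size(unweighted) = size(weighted)
  AND all(pair IN unweighted WHERE pair IN weighted) AS sameTopology
```

The size comparison alone would not be enough, since two graphs can have the same number of relationships but different endpoints. Checking list membership is fine for small example graphs like these, but it scales quadratically and is not meant for large graphs.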