Collapse Path

This feature is in the beta tier. For more information on feature tiers, see API Tiers.

Introduction

The Collapse Path algorithm is a traversal algorithm capable of creating relationships between the start and end nodes of a traversal. In other words, the path between a start node and an end node is collapsed into a single relationship (a direct path). The algorithm is intended to support the creation of monopartite graphs required by many graph algorithms.

The main input for the algorithm is a list of path templates. Starting from every node in the specified graph, the relationships of each template are traversed one after the other using the order specified in the configuration. Only nodes reached after traversing entire paths are used as end nodes. Exactly one directed relationship is created for every pair of nodes for which at least one path from start to end node exists.

Syntax

Collapse Path syntax per mode

Run Collapse Path in mutate mode on a named graph.

CALL gds.collapsePath.mutate(
  graphName: String,
  configuration: Map
)
YIELD
  preProcessingMillis: Integer,
  computeMillis: Integer,
  mutateMillis: Integer,
  relationshipsWritten: Integer,
  configuration: Map

Table 1. Parameters
Name	Type	Default	Optional	Description
graphName	String	`n/a`	no	The name of a graph stored in the catalog.
configuration	Map	`{}`	yes	Configuration for algorithm-specifics and/or graph filtering.

Table 2. General configuration for algorithm execution on a named graph.
Name	Type	Default	Optional	Description
nodeLabels	List of String	`['*']`	yes	Filter the named graph using the given node labels.
concurrency	Integer	`4 ^[1]`	yes	The number of concurrent threads used for running the algorithm.
1. In a GDS Session the default is the number of available processors

Table 3. Algorithm specific configuration
Name	Type	Default	Optional	Description
pathTemplates	List of List of String	`n/a`	no	A path template is an ordered list of relationship types used for the traversal. The same relationship type can be added multiple times, in order to traverse them as indicated. And, you may specify several path templates to process in one go.
mutateRelationshipType	String	`n/a`	no	Relationship type of the newly created relationships.
allowSelfLoops	Boolean	`false`	yes	Indicates whether it is possible to create self referencing relationships, i.e. relationships where the start and end node are identical.

Table 4. Results
Name	Type	Description
`preProcessingMillis`	Integer	Milliseconds for preprocessing the data.
`computeMillis`	Integer	Milliseconds for running the algorithm.
`mutateMillis`	Integer	Milliseconds for adding properties to the projected graph.
`relationshipsWritten`	Integer	The number of relationships created by the algorithm.
`configuration`	Map	The configuration used for running the algorithm.

Examples

All the examples below should be run in an empty database.

The examples use Cypher projections as the norm. Native projections will be deprecated in a future release.

Consider the graph created by the following Cypher statement:

CREATE
  (Dan:Person),
  (Annie:Person),
  (Matt:Person),
  (Jeff:Person),

  (Guitar:Instrument),
  (Flute:Instrument),

  (Dan)-[:PLAYS]->(Guitar),
  (Annie)-[:PLAYS]->(Guitar),

  (Matt)-[:PLAYS]->(Flute),
  (Jeff)-[:PLAYS]->(Flute)

In this example we want to create a relationship, called PLAYS_SAME_INSTRUMENT, between Person nodes that play the same instrument. To achieve that we have to traverse a path specified by the following Cypher pattern:

(p1:Person)-[:PLAYS]->(:Instrument)-[:PLAYED_BY]->(p2:Person)

In our source graph only the PLAYS relationship type exists. The PLAYED_BY relationship type can be created by loading the PLAYS relationship type in REVERSE direction. The following query will project such a graph:

MATCH (p:Person)-[:PLAYS]->(i:Instrument)
CALL {
  WITH p, i
  RETURN id(p) AS sourceId, id(i) AS targetId, 'PLAYS' AS rType
  UNION
  WITH p, i
  RETURN id(i) AS sourceId, id(p) AS targetId, 'PLAYED_BY' AS rType
}
RETURN gds.graph.project('persons', sourceId, targetId, { relationshipType: rType })

Now we can run the algorithm by specifying the traversal PLAYS, PLAYED_BY in the pathTemplates option.

CALL gds.collapsePath.mutate(
  'persons',
  {
    pathTemplates: [['PLAYS', 'PLAYED_BY']],
    allowSelfLoops: false,
    mutateRelationshipType: 'PLAYS_SAME_INSTRUMENT'
  }
) YIELD relationshipsWritten

Table 5. Results
relationshipsWritten
4

The mutated graph will look like the following graph when filtered by the PLAYS_SAME_INSTRUMENT relationship

CREATE
  (Dan:Person),
  (Annie:Person),
  (Matt:Person),
  (Jeff:Person),

  (Guitar:Instrument),
  (Flute:Instrument),

  (Dan)-[:PLAYS_SAME_INSTRUMENT]->(Annie),
  (Annie)-[:PLAYS_SAME_INSTRUMENT]->(Dan),

  (Matt)-[:PLAYS_SAME_INSTRUMENT]->(Jeff),
  (Jeff)-[:PLAYS_SAME_INSTRUMENT]->(Matt)