# Cypher projection

This is documentation for the Graph Algorithms Library, which has been deprecated in favor of the Graph Data Science Library (GDS).

If node-label and relationship-type projection is not selective enough to describe the subgraph we want to run the algorithm on, we can use Cypher statements to project a subset of the graph instead. Pass a node statement in place of the label parameter and a relationship statement in place of the relationship type, and set `graph: 'cypher'` in the config.

Relationships returned by the relationship statement are only projected if both their source and target nodes are returned by the node statement; all other relationships are ignored.
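
To see how many relationships a projection would silently drop, we can count the relationships whose endpoints fall outside the node statement. The sketch below assumes the node statement `MATCH (p:Page) RETURN id(p) AS id` and a relationship statement that matches `(a)-[:Link]->(b)` without label constraints; the `Page` label and `Link` type are borrowed from the PageRank example later in this section.

```cypher
// Count Link relationships with at least one endpoint that is not a Page.
// These would be returned by the relationship statement but ignored by the
// projection, because the node statement only returns Page nodes.
MATCH (a)-[:Link]->(b)
WHERE NOT a:Page OR NOT b:Page
RETURN count(*) AS ignoredRelationships;
```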

Cypher projection lets us describe the subgraph we want to analyse more expressively, but projecting the graph can take longer, especially with complex Cypher queries.

The general call syntax is:

```cypher
CALL algo.<name>(
'MATCH (n) RETURN id(n) AS id',
"MATCH (n)-->(m) RETURN id(n) AS source, id(m) AS target",
{graph: "cypher"})
```
• The first query `MATCH (n) RETURN id(n) AS id` returns node ids. The Cypher loader expects the query to return an `id` field.

• The second query `MATCH (n)-->(m) RETURN id(n) AS source, id(m) AS target` returns pairs of node ids that have a relationship between them in our projected graph. The Cypher loader expects the query to return `source` and `target` fields.

Note that in both queries we use the `id` function to return the node id.
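
To check that a statement returns the fields the loader expects, one option is to run it on its own with a small `LIMIT` before passing it to an algorithm. A minimal sketch (the `LIMIT 5` is only there to keep the preview short and is not part of the projection):

```cypher
// Preview the node statement: the result should have a single `id` column.
MATCH (n) RETURN id(n) AS id LIMIT 5;

// Preview the relationship statement: the result should have
// `source` and `target` columns.
MATCH (n)-->(m) RETURN id(n) AS source, id(m) AS target LIMIT 5;
```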

## 1. Weights

We can also return a property value or weight (according to our config) in addition to the ids from these statements. We do this by returning an optional `weight` field.

The following will run the algorithm over a graph based on Cypher projections for nodes and relationships, using the `score` property of each relationship as weight:

```cypher
CALL algo.<name>(
'MATCH (n) RETURN id(n) AS id',
"MATCH (n)-[r]->(m) RETURN id(n) AS source, id(m) AS target, r.score AS weight",
{graph: "cypher"})
```
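
If some relationships lack the weight property, the statement returns `null` for them. One way to avoid that, sketched here under the assumption that a missing `score` should count as a weight of `1.0`, is to supply a fallback with `coalesce`:

```cypher
// Relationship statement with a fallback weight.
// The default of 1.0 is an illustrative choice, not a library default.
MATCH (n)-[r]->(m)
RETURN id(n) AS source, id(m) AS target, coalesce(r.score, 1.0) AS weight
```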

## 2. Parameters

We can pass parameters to our Cypher projection queries via the `params` config option.

The following will run the algorithm over a graph based on Cypher projections that use the `$firstName` and `$lastName` parameters:

```cypher
CALL algo.<name>(
'MATCH (n)
WHERE n.name CONTAINS $firstName
RETURN id(n) AS id',
"MATCH (n)-->(m)
WHERE n.name CONTAINS $lastName OR m.name CONTAINS $lastName
RETURN id(n) AS source, id(m) AS target",
{graph: "cypher", params: {firstName: $firstName, lastName: $lastName} })
```
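
The values on the right-hand side of the `params` map above are themselves parameters of the outer query, so the client has to supply them. As a sketch, literal values can be placed in the map instead; the names `'Alice'` and `'Smith'` are made up for illustration:

```cypher
CALL algo.<name>(
'MATCH (n)
WHERE n.name CONTAINS $firstName
RETURN id(n) AS id',
"MATCH (n)-->(m)
WHERE n.name CONTAINS $lastName OR m.name CONTAINS $lastName
RETURN id(n) AS source, id(m) AS target",
// Hypothetical literal values passed through to both projection queries.
{graph: "cypher", params: {firstName: 'Alice', lastName: 'Smith'}})
```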

## 3. Example Usage

We could use Cypher projections to run the PageRank algorithm on DBpedia, as shown in the following examples.

The following will run the PageRank algorithm over a graph based on Cypher projections for nodes and relationships:
```cypher
CALL algo.pageRank(
'MATCH (p:Page) RETURN id(p) as id',
'MATCH (p1:Page)-[:Link]->(p2:Page) RETURN id(p1) as source, id(p2) as target',
{graph:'cypher', iterations:5, write: true});
```

Cypher projection can also be used to project a virtual (non-stored) graph. The following projects an undirected graph of people who visited the same web page and runs the Louvain community detection algorithm on it, using the number of web pages that each pair of people has in common as the relationship weight:

```cypher
CALL algo.louvain(
'MATCH (p:Person) RETURN id(p) as id',
'MATCH (p1:Person)-[:VISIT]->(:Page)<-[:VISIT]-(p2:Person)
RETURN id(p1) as source, id(p2) as target, count(*) as weight',
{graph:'cypher', iterations:5, write: true});
```
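
Before running the algorithm, it can help to inspect the virtual relationships that the second statement produces. The query below is a sketch for previewing only: the `WHERE id(p1) < id(p2)` filter, `ORDER BY`, and `LIMIT` are additions that list each pair once and show the strongest connections first; they are not part of the projection.

```cypher
// Preview the co-visit pairs and their weights, strongest first.
MATCH (p1:Person)-[:VISIT]->(:Page)<-[:VISIT]-(p2:Person)
WHERE id(p1) < id(p2)
RETURN id(p1) AS source, id(p2) AS target, count(*) AS weight
ORDER BY weight DESC
LIMIT 10;
```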

By default, Cypher projection queries are run on a single thread.

We can parallelize loading by including a `SKIP` clause with a `$skip` parameter and a `LIMIT` clause with a `$limit` parameter in our projection queries. Parallelization of the queries is controlled by the `batchSize` key in the config, which has a default value of 10,000.

If the parameters are not named `$skip` and `$limit`, they are ignored and the graph is loaded sequentially.

The following runs PageRank over a graph based on Cypher projections, with nodes loaded in parallel:

```cypher
CALL algo.pageRank(
'MATCH (p:Page) WITH p SKIP $skip LIMIT $limit RETURN id(p) as id',
'MATCH (p1:Page)-[:Link]->(p2:Page) RETURN id(p1) as source, id(p2) as target',
{graph:'cypher', iterations:5, write: true});
```

The following runs PageRank over a graph based on Cypher projections, with relationships loaded in parallel:

```cypher
CALL algo.pageRank(
'MATCH (p:Page) RETURN id(p) as id',
'MATCH (p1:Page)-[:Link]->(p2:Page) WITH * SKIP $skip LIMIT $limit RETURN id(p1) as source, id(p2) as target',
{graph:'cypher', iterations:5, write: true});
```

The following runs PageRank over a graph based on Cypher projections, with nodes and relationships loaded in parallel:

```cypher
CALL algo.pageRank(
'MATCH (p:Page) WITH p SKIP $skip LIMIT $limit RETURN id(p) as id',
'MATCH (p1:Page)-[:Link]->(p2:Page) WITH * SKIP $skip LIMIT $limit RETURN id(p1) as source, id(p2) as target',
{graph:'cypher', iterations:5, write: true});
```
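
Since parallel loading depends on `batchSize`, that key can be set in the config alongside the other options. The following sketch repeats the last call with an explicit `batchSize`; the value `50000` is an arbitrary example, not a recommendation:

```cypher
CALL algo.pageRank(
'MATCH (p:Page) WITH p SKIP $skip LIMIT $limit RETURN id(p) as id',
'MATCH (p1:Page)-[:Link]->(p2:Page) WITH * SKIP $skip LIMIT $limit RETURN id(p1) as source, id(p2) as target',
// batchSize overrides the default of 10,000 (value chosen for illustration).
{graph:'cypher', iterations:5, write: true, batchSize: 50000});
```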


If the similarity lists are very large, they can take up a lot of memory. When those lists contain many values that should be skipped, a less memory-intensive approach is to use Cypher statements to project the graph instead. The projection statement should return the following fields:

• `item` - should contain node ids, which we can return using the `id` function.
• `category` - should contain node ids, which we can return using the `id` function.
• `weight` - should contain a double value.
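
A statement returning these three fields might look like the sketch below. The `Person` label, `LIKES` relationship type, and `score` property are hypothetical and stand in for whatever the data model uses; `toFloat` converts the property to a double:

```cypher
// Hypothetical schema: (:Person)-[:LIKES {score: ...}]->(c)
MATCH (person:Person)-[likes:LIKES]->(c)
RETURN id(person) AS item, id(c) AS category, toFloat(likes.score) AS weight
```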