Updating the data

Earlier you learned how to represent nodes, relationships, labels, properties, and patterns in Cypher®. This section adds another level to your knowledge by introducing how to update and delete data with Cypher.

While these are the standard CRUD (create, update, and delete) operations, some things function a bit differently in a graph than in other types of databases. You will probably recognize some of the similarities and differences as we go along.

Updating data with Cypher

You may already have a node or relationship in the data, but you want to modify its properties. You can do this by matching the pattern you want to find and using the SET keyword to add, remove, or update properties.

We continue to use the following dataset:

To create the aforementioned graph, run the Cypher query:

CREATE (diana:Person {name: "Diana"})
CREATE (melissa:Person {name: "Melissa", twitter: "@melissa"})
CREATE (dan:Person {name: "Dan", twitter: "@dan", yearsExperience: 6})
CREATE (sally:Person {name: "Sally", yearsExperience: 4})
CREATE (john:Person {name: "John", yearsExperience: 5})
CREATE (jennifer:Person {name: "Jennifer", twitter: "@jennifer", yearsExperience: 5})
CREATE (joe:Person {name: "Joe"})
CREATE (mark:Person {name: "Mark", twitter: "@mark"})
CREATE (ann:Person {name: "Ann"})
CREATE (xyz:Company {name: "XYZ"})
CREATE (x:Company {name: "Company X"})
CREATE (a:Company {name: "Company A"})
CREATE (Neo4j:Company {name: "Neo4j"})
CREATE (abc:Company {name: "ABC"})
CREATE (query:Technology {type: "Query Languages"})
CREATE (etl:Technology {type: "Data ETL"})
CREATE (integrations:Technology {type: "Integrations"})
CREATE (graphs:Technology {type: "Graphs"})
CREATE (dev:Technology {type: "Application Development"})
CREATE (java:Technology {type: "Java"})
CREATE (diana)-[:LIKES]->(query)
CREATE (melissa)-[:LIKES]->(query)
CREATE (dan)-[:LIKES]->(etl)<-[:LIKES]-(melissa)
CREATE (xyz)<-[:WORKS_FOR]-(sally)-[:LIKES]->(integrations)<-[:LIKES]-(dan)
CREATE (sally)<-[:IS_FRIENDS_WITH]-(john)-[:LIKES]->(java)
CREATE (john)<-[:IS_FRIENDS_WITH]-(jennifer)-[:LIKES]->(java)
CREATE (john)-[:WORKS_FOR]->(xyz)
CREATE (sally)<-[:IS_FRIENDS_WITH]-(jennifer)-[:IS_FRIENDS_WITH]->(melissa)
CREATE (joe)-[:LIKES]->(query)
CREATE (x)<-[:WORKS_FOR]-(diana)<-[:IS_FRIENDS_WITH]-(joe)-[:IS_FRIENDS_WITH]->(mark)-[:LIKES]->(graphs)<-[:LIKES]-(jennifer)-[:WORKS_FOR]->(Neo4j)
CREATE (ann)<-[:IS_FRIENDS_WITH]-(jennifer)-[:IS_FRIENDS_WITH]->(mark)
CREATE (john)-[:LIKES]->(dev)<-[:LIKES]-(ann)-[:IS_FRIENDS_WITH]->(dan)-[:WORKS_FOR]->(abc)
CREATE (ann)-[:WORKS_FOR]->(abc)
CREATE (a)<-[:WORKS_FOR]-(melissa)-[:LIKES]->(graphs)<-[:LIKES]-(diana)

Using the above example dataset thus far, you could update Jennifer’s node to add her birthdate. The next Cypher statement shows how to do this.

  1. First, you need to find the existing node for Jennifer.

  2. Next, use SET to create the new property (with syntax variable.property) and set its value.

  3. Finally, you can return Jennifer’s node to ensure that the information was updated correctly.

MATCH (p:Person {name: 'Jennifer'})
SET p.birthdate = date('1980-01-01')
RETURN p

Query result:

Set Properties: 1
Rows: 1

+------------------------------------------------------+
| p                                                    |
+------------------------------------------------------+
|(Person: {birthdate: '1980-01-01', name: 'Jennifer'}) |
+------------------------------------------------------+

For more information on using date() and other temporal functions, visit the Cypher manual → Temporal functions.

If you want to change Jennifer’s birthdate, you can use the same query above to find Jennifer’s node again and put a different date in the SET clause.

You are able also update Jennifer’s WORKS_FOR relationship with the Company node to include the year that she started working there. To do this, you can use similar syntax as above for updating nodes.

MATCH (:Person {name: 'Jennifer'})-[rel:WORKS_FOR]-(:Company {name: 'Neo4j'})
SET rel.startYear = date({year: 2018})
RETURN rel

Query result:

Set Properties: 1
Rows: 1

+-----------------------------------------+
| rel                                     |
+-----------------------------------------+
| [:WORKS_FOR {startYear: '2018-01-01'}]  |
+-----------------------------------------+

If you want to return a graph view on the above query, you can add variables to the nodes for p:Person and c:Company and write the return line as RETURN p, rel, c.

Deleting data with Cypher

Another operation to cover is how to delete data in Cypher. For this operation, Cypher uses the DELETE keyword for deleting nodes and relationships. It is very similar to deleting data in other languages like SQL, with one exception.

Because Neo4j is ACID-compliant, you cannot delete a node if it still has relationships. If you could do that, then you might end up with a relationship pointing to nothing and an incomplete graph.

Delete a relationship

To delete a relationship, you need to find the start and end nodes for the relationship you want to delete and then use the DELETE keyword, as shown in the code block below. Let us go ahead and delete the IS_FRIENDS_WITH relationship between Jennifer and Mark for now. We will add this relationship back in a later exercise.

MATCH (j:Person {name: 'Jennifer'})-[r:IS_FRIENDS_WITH]->(m:Person {name: 'Mark'})
DELETE r

Query result:

+-----------------------------------------+
| Deleted Relationships: 1                |
| Rows: 0                                 |
+-----------------------------------------+

Delete a node

To delete a node that does not have any relationships, you need to find the node you want to delete and then use the DELETE keyword, just as you did for the relationship above. You can delete Mark’s node for now and bring him back later.

MATCH (m:Person {name: 'Mark'})
DELETE m

Query result:

+-----------------------------------------+
| Deleted Nodes: 1                        |
| Rows: 0                                 |
+-----------------------------------------+

If you have created an empty node by mistake and you need to delete it, you can use the following Cypher statement to do it:

MATCH (n)
WHERE id(n) = 5
DETACH DELETE n

This statement deletes not only a node but also all relationships it has. To run the statement, you should know a node’s internal ID.

Delete a node and its relationship

Instead of running the last two queries to delete the IS_FRIENDS_WITH relationship and the Person node for Mark, you can actually run a single statement to delete the node and its relationship at the same time. As it was mentioned above, Neo4j is ACID-compliant so it doesn’t allow to delete a node if it still has relationships. Using the DETACH DELETE syntax tells Cypher to delete any relationships the node has, as well as remove the node itself.

The statement would look like the code below. First, you find Mark’s node in the database. Then, the DETACH DELETE line removes any existing relationships Mark node has before also deleting the node.

MATCH (m:Person {name: 'Mark'})
DETACH DELETE m

Delete properties

You can also remove properties, but instead of using the DELETE keyword, you can use a couple of other approaches.

The first option is to use REMOVE on the property. This tells Neo4j that you want to remove the property from the node entirely and no longer store it.

The second option is to use the SET keyword from earlier to set the property value to null. Unlike other database models, Neo4j does not store null values. Instead, it only stores properties and values that are meaningful to your data. This means that you can have different types and amounts of properties on various nodes and relationships in your graph.

To show you both options, let us look at the code for each.

//delete property using REMOVE keyword
MATCH (n:Person {name: 'Jennifer'})
REMOVE n.birthdate

//delete property with SET to null value
MATCH (n:Person {name: 'Jennifer'})
SET n.birthdate = null

Query result:

+-----------------------------------------+
| Set Properties: 1                       |
| Rows: 0                                 |
+-----------------------------------------+

Avoiding duplicate data using MERGE

It was briefly mentioned earlier that there are some ways in Cypher to avoid creating duplicate data. One of those ways is using the MERGE keyword. MERGE does a "select-or-insert" operation that first checks if the data exists in the database. If it exists, then Cypher returns it as is or makes any updates you specify on the existing node or relationship. If the data does not exist, then Cypher will create it with the information you specify.

Using MERGE on a node

To start, let us look at an example of this by adding Mark back to our database using the query below. You can use MERGE to ensure that Cypher checks the database for an existing node for Mark. Since you removed Mark’s node in the previous examples, Cypher will not find an existing match and will create the node new with the name property set to 'Mark'.

MERGE (mark:Person {name: 'Mark'})
RETURN mark

Query result:

If you run the same statement again, Cypher will find an existing node this time that has the name property set to Mark, so it will return the matched node without any changes.

Using MERGE on a relationship

Just like you use MERGE to find or create a node in Cypher, you can do the same thing to find or create a relationship. Let’s re-create the IS_FRIENDS_WITH relationship between Mark and Jennifer that we had in a previous example.

MATCH (j:Person {name: 'Jennifer'})
MATCH (m:Person {name: 'Mark'})
MERGE (j)-[r:IS_FRIENDS_WITH]->(m)
RETURN j, r, m

Notice that here MATCH is used to find both Mark’s node and Jennifer’s node before we use MERGE to find or create the relationship between them.

Why do we not use a single statement?

MERGE looks for an entire pattern that you specify to see whether to return an existing one or create it new. If the entire pattern (nodes, relationships, and any specified properties) does not exist, Cypher creates it.

Cypher never produces a partial mix of matching and creating within a pattern. To avoid a mix of match and create, you need to match any existing elements of your pattern first before doing a merge on any elements you might want to create, just as we did in the statement above.

Just for reference, the Cypher statement that causes duplicates is below. Since this pattern (Jennifer IS_FRIENDS_WITH Mark) does not exist in the database, Cypher creates the entire pattern new — both nodes, as well as the relationship between them.

//this statement will create duplicate nodes for Mark and Jennifer
MERGE (j:Person {name: 'Jennifer'})-[r:IS_FRIENDS_WITH]->(m:Person {name: 'Mark'})
RETURN j, r, m

Handling MERGE criteria

Perhaps you want to use MERGE to ensure you do not create duplicates, but you want to initialize certain properties if the pattern is created and update other properties if it is only matched. In this case, you can use ON CREATE or ON MATCH with the SET keyword to handle these situations.

Let us look at an example.

MERGE (m:Person {name: 'Mark'})-[r:IS_FRIENDS_WITH]-(j:Person {name:'Jennifer'})
  ON CREATE SET r.since = date('2018-03-01')
  ON MATCH SET r.updated = date()
RETURN m, r, j