Creating Nodes and Relationships

About this module

You have learned how to query a graph using a number of Cypher clauses and keywords that start with the MATCH clause. You learned how to retrieve data based upon label values, property key values, and relationship types, and how to perform some useful intermediate processing during a query to control what data is returned.

At the end of this module, you will write Cypher statements to:

  • Create a node

    • Add and remove node labels

    • Add and remove node properties

    • Update properties

  • Create a relationship

    • Add and remove properties for a relationship

  • Delete a node

  • Delete a relationship

  • Merge data in a graph

    • Creating nodes

    • Creating relationships

Because the code examples in this lesson modify the database, it is recommended that you do not execute them against your database as you will be doing so in the hands-on exercises.

Creating nodes

Recall that a node is an element of a graph representing a domain entity that has zero or more labels, properties, and relationships to or from other nodes in the graph.

When you create a node, you can add it to the graph without connecting it to another node.

Here is the simplified syntax for creating a node:

CREATE (optionalVariable optionalLabels {optionalProperties})

If you plan on referencing the newly created node, you must provide a variable. Whether you provide labels or properties at node creation time is optional. In most cases, you will want to provide some label and property values for a node when created. This will enable you to later retrieve the node. Provided you have a reference to the node (for example, using a MATCH clause), you can always add, update, or remove labels and properties at a later time.

Here are some examples of creating a single node in Cypher:

Add a node to the graph of type Movie with the title Batman Begins. This node can be retrieved using the title. A set of nodes with the label Movie can also be retrieved which will contain this node:

CREATE (:Movie {title: 'Batman Begins'})

Add a node with two labels to the graph of types Movie and Action with the title Batman Begins. This node can be retrieved using the title. A set of nodes with the labels Movie or Action can also be retrieved which will contain this node:

CREATE (:Movie:Action {title: 'Batman Begins'})

Add a node to the graph of types Movie and Action with the title Batman Begins. This node can be retrieved using the title. A set of nodes with the labels Movie or Action can also be retrieved which will contain this node. The variable m can be used for later processing after the CREATE clause:

CREATE (m:Movie:Action {title: 'Batman Begins'})

Add a node to the graph of types Movie and Action with the title Batman Begins. This node can be retrieved using the title. A set of nodes with the labels Movie or Action can also be retrieved which will contain this node. Here we return the title of the node:

CREATE (m:Movie:Action {title: ' Batman Begins'})
RETURN m.title

Here is what you see in Neo4j Browser when you create a node for the movie, Batman Begins where we have selected the node returned to view its properties.:

CreateBatmanBegins

When the graph engine creates a node, it automatically assigns a read-only, unique ID to the node. Here we see that the id of the node is 568. This is not a property of a node, but rather an internal value.

After you have created a node, you can add more properties or labels to it and most importantly, connect it to another node.

Creating multiple nodes

You can create multiple nodes by simply separating the nodes specified with commas, or by specifying multiple CREATE statements.

Here is an example, where we create some Person nodes that will represent some of the people associated with the movie Batman Begins:

CREATE
(:Person {name: 'Michael Caine', born: 1933}),
(:Person {name: 'Liam Neeson', born: 1952}),
(:Person {name: 'Katie Holmes', born: 1978}),
(:Person {name: 'Benjamin Melniker', born: 1913})

Here is the result of running this Cypher statement:

CreateMultiplePersonNodes

The graph engine will create a node with the same properties of a node that already exists. You can prevent this from happening in one of two ways:

  1. You can use MERGE rather than CREATE when creating the node.

  2. You can add constraints to your graph.

You will learn about merging data later in this module. Constraints are configured globally for a graph and are covered later in this training.

Adding labels to a node

You may not know ahead of time what label or labels you want for a node when it is created. You can add labels to a node using the SET clause.

Here is the simplified syntax for adding labels to a node:

SET x:Label         // adding one label to node referenced by the variable x
SET x:Label1:Label2	// adding two labels to node referenced by the variable x

If you attempt to add a label to a node for which the label already exists, the SET processing is ignored.

Here is an example where we add the Action label to the node that has a label, Movie:

MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m:Action
RETURN labels(m)

Assuming that we have previously created the node for the movie, here is the result of running this Cypher statement:

SetActionLabel

Notice here that we call the built-in function, labels() that returns the set of labels for the node.

Removing labels from a node

Perhaps your data model has changed or the underlying data for a node has changed so that the label for a node is no longer useful or valid.

Here is the simplified syntax for removing labels from a node:

REMOVE x:Label	  // remove the label from the node referenced by the variable x

If you attempt to remove a label from a node for which the label does not exist, the SET processing is ignored.

Here is an example where we remove the Action label from the node that has a labels, Movie and Action:

MATCH (m:Movie:Action)
WHERE m.title = 'Batman Begins'
REMOVE m:Action
RETURN labels(m)

Assuming that we have previously created the node for the movie, here is the result of running this Cypher statement:

RemoveActionLabel

Adding properties to a node

After you have created a node and have a reference to the node, you can add properties to the node, again using the SET keyword.

Here are simplified syntax examples for adding properties to a node referenced by the variable x:

SET x.propertyName = value
SET x.propertyName1 = value1	, x.propertyName2 = value2
SET x = {propertyName1: value1, propertyName2: value2}
SET x += {propertyName1: value1, propertyName2: value2}

If the property does not exist, it is added to the node. If the property exists, its value is updated. If the value specified is null, the property is removed.

Note that the type of data for a property is not enforced. That is, you can assign a string value to a property that was once a numeric value and visa versa.

When you specify the JSON-style object for assignment (using =) of the property values for the node, the object must include all of the properties and their values for the node as the existing properties for the node are overwritten. However, if you specify += when assigning to a property, the value at valueX is updated if the propertyNnameX exists for the node. If the propertyNameX does not exist for the node, then the property is added to the node.

Here is an example where we add the properties released and lengthInMinutes to the movie Batman Begins:

MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m.released = 2005, m.lengthInMinutes = 140
RETURN m

Assuming that we have previously created the node for the movie, here is the result of running this Cypher statement:

AddReleasedMinutesProperties

Here is another example where we set the property values to the movie node using the JSON-style object containing the property keys and values. Note that all properties must be included in the object.

MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET  m = {title: 'Batman Begins',
          released: 2005,
          lengthInMinutes: 140,
          videoFormat: 'DVD',
          grossMillions: 206.5}
RETURN m

Here is the result of running this Cypher statement:

SetPropertiesObject

Note that when you add a property to a node for the first time in the graph, the property key is added to the graph. So for example, in the previous example, we added the videoFormat and grossMillions property keys to the graph as they have never been used before for a node in the graph. Once a property key is added to the graph, it is never removed. When you examine the property keys in the database (by executing CALL db.propertyKeys(), you will see all property keys created for the graph, regardless of whether they are currently used for nodes and relationships.

AllPropertyKeys

Here is an example where we use the JSON-style object to add the awards property to the node and update the grossMillions property:

MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET  m += { grossMillions: 300,
            awards: 66}
RETURN m

Here is the result:

AddAndUpdateProperties

Removing properties from a node

There are two ways that you can remove a property from a node. One way is to use the REMOVE keyword. The other way is to set the property’s value to null.

Here are simplified syntax examples for removing properties from a node referenced by the variable x:

REMOVE x.propertyName
SET x.propertyName = null

Suppose we determined that no other Movie node in the graph has the properties, videoFormat and grossMillions. There is no restriction that nodes of the same type must have the same properties. However, we have decided that we want to remove these properties from this node. Here is an example Cypher to remove this property from this Batman Begins node:

MATCH (m:Movie)
WHERE m.title = 'Batman Begins'
SET m.grossMillions = null
REMOVE m.videoFormat
RETURN m

Assuming that we have previously created the node for the movie with the these properties, here is the result of running this Cypher statement where we remove each property a different way. One way we remove the property using the SET clause is to set the property to null. And in another way, we use the REMOVE clause.

RemoveProperties

Exercise 8: Creating Nodes

In the query edit pane of Neo4j Browser, execute the browser command:

:play intro-neo4j-exercises

and follow the instructions for Exercise 8.

Creating relationships

As you have learned in the previous exercises where you query the graph, you often query using connections between nodes. The connections capture the semantic relationships and context of the nodes in the graph.

Here is the simplified syntax for creating a relationship between two nodes referenced by the variables x and y:

CREATE (x)-[:REL_TYPE]->(y)
CREATE (x)<-[:REL_TYPE]-(y)

When you create the relationship, it must have direction. You can query nodes for a relationship in either direction, but you must create the relationship with a direction. An exception to this is when you create a node using MERGE that you will learn about later in this module.

In most cases, unless you are connecting nodes at creation time, you will retrieve the two nodes, each with their own variables, for example, by specifying a WHERE clause to find them, and then use the variables to connect them.

Here is an example. We want to connect the actor, Michael Caine with the movie, Batman Begins. We first retrieve the nodes of interest, then we create the relationship:

MATCH (a:Person), (m:Movie)
WHERE a.name = 'Michael Caine' AND m.title = 'Batman Begins'
CREATE (a)-[:ACTED_IN]->(m)
RETURN a, m

Here is the result of running this Cypher statement:

CreateActedInRelationship
Before you run these Cypher statements, you may see a warning in Neo4j Browser that you are creating a query that is a cartesian product that could potentially be a performance issue. You will see this warning if you have no unique constraint on the lookup keys. You will learn about uniqueness constraints later in the next module. If you are familiar with the data in the graph and can be sure that the MATCH clauses will not retrieve large amounts of data, you can continue. In our case, we are simply looking up a particular Person node and a particular Movie node so we can create the relationship.

You can create multiple relationships at once by simply providing the pattern for the creation that includes the relationship types, their directions, and the nodes that you want to connect.

Here is an example where we have already created Person nodes for an actor, Liam Neeson, and a producer, Benjamin Melniker. We create two relationships in this example, one for ACTED_IN and one for PRODUCED.

MATCH (a:Person), (m:Movie), (p:Person)
WHERE a.name = 'Liam Neeson' AND
      m.title = 'Batman Begins' AND
      p.name = 'Benjamin Melniker'
CREATE (a)-[:ACTED_IN]->(m)<-[:PRODUCED]-(p)
RETURN a, m, p

Here is the result of running this Cypher statement:

CreateTwoRelationships
When you create relationships based upon a MATCH clause, you must be certain that only a single node is returned for the MATCH, otherwise multiple relationships will be created.

Adding properties to relationships

You can add properties to a relationship, just as you add properties to a node. You use the SET clause to do so.

Here is the simplified syntax for adding properties to a relationship referenced by the variable r:

SET r.propertyName = value
SET r.propertyName1 = value1	, r.propertyName2 = value2
SET r = {propertyName1: value1, propertyName2: value2}
SET r += {propertyName1: value1, propertyName2: value2}

If the property does not exist, it is added to the relationship. If the property exists, its value is updated for the relationship. When you specify the JSON-style object for assignment to the relationship using =, the object must include all of the properties for the relationship, just as you need to do for nodes. If you use +=, you can add or update properties, just as you do for nodes.

Here is an example where we will add the roles property to the ACTED_IN relationship from Christian Bale to Batman Begins right after we have created the relationship:

MATCH (a:Person), (m:Movie)
WHERE a.name = 'Christian Bale' AND m.title = 'Batman Begins'
CREATE (a)-[rel:ACTED_IN]->(m)
SET rel.roles = ['Bruce Wayne','Batman']
RETURN a, m

Here is the result of running this Cypher statement:

AddRelationshipWithRoles

The roles property is a list so we add it as such. If the relationship had multiple properties, we could have added them as a comma separated list or as an object, like you can do for node properties.

You can also add properties to a relationship when the relationship is created. Here is another way to create and add the properties for the relationship:

MATCH (a:Person), (m:Movie)
WHERE a.name = 'Christian Bale' AND m.title = 'Batman Begins'
CREATE (a)-[:ACTED_IN {roles: ['Bruce Wayne', 'Batman']}]->(m)
RETURN a, m

By default, the graph engine will create a relationship between two nodes, even if one already exists. You can test to see if the relationship exists before you create it as follows:

MATCH (a:Person),(m:Movie)
WHERE a.name = 'Christian Bale' AND
      m.title = 'Batman Begins' AND
      NOT exists((a)-[:ACTED_IN]->(m))
CREATE (a)-[rel:ACTED_IN]->(m)
SET rel.roles = ['Bruce Wayne','Batman']
RETURN a, rel, m
You can prevent duplication of relationships by merging data using the MERGE clause, rather than the CREATE clause. You will learn about merging data later in this module.

Removing properties from a relationship

There are two ways that you can remove a property from a node. One way is to use the REMOVE keyword. The other way is to set the property’s value to null, just as you do for properties of nodes.

Suppose we have added the ACTED_IN relationship between Christian Bale and the movie, Batman Returns where the roles property is added to the relationship. Here is an example to remove the roles property, yet keep the ACTED_IN relationship:

MATCH (a:Person)-[rel:ACTED_IN]->(m:Movie)
WHERE a.name = 'Christian Bale' AND m.title = 'Batman Begins'
REMOVE rel.roles
RETURN a, rel, m

Here is the result returned. An alternative to REMOVE rel.roles would be SET rel.roles = null

RemoveRoles

Exercise 9: Creating Relationships

In the query edit pane of Neo4j Browser, execute the browser command:

:play intro-neo4j-exercises

and follow the instructions for Exercise 9.

Deleting nodes and relationships

If a node has no relationships to any other nodes, you can simply delete it from the graph using the DELETE clause. Relationships are also deleted using the DELETE clause.

If you attempt to delete a node in the graph that has relationships in or out of the node, the graph engine will return an error because deleting such a node will leave orphaned relationships in the graph.

Deleting relationships

Here are the existing nodes and relationships for the Batman Begins movie:

BatmanBeginsRelationships

You can delete a relationship between nodes by first finding it in the graph and then deleting it.

In this example, we want to delete the ACTED_IN relationship between Christian Bale and the movie Batman Begins. We find the relationship, and then delete it:

MATCH (a:Person)-[rel:ACTED_IN]->(m:Movie)
WHERE a.name = 'Christian Bale' AND m.title = 'Batman Begins'
DELETE rel
RETURN a, m

Here is the result of running this Cypher statement:

DeleteRelationship

Notice that there no longer exists the relationship between Christian Bale and the movie Batman Begins.

We can now query the nodes related to Batman Begins to see that this movie now only has two actors and one producer connected to it:

BatmanBeginsRelationships2

Even though we have deleted the relationship between actor, Christian Bale and the movie Batman Begins, we note that this actor is connected to another movie in the graph, so we will not delete this Christian Bale node.

ChristianBaleConnections

In this example, we find the node for the producer, Benjamin Melniker, as well as his relationship to movie nodes. First, we delete the relationship(s), then we delete the node:

MATCH (p:Person)-[rel:PRODUCED]->(:Movie)
WHERE p.name = 'Benjamin Melniker'
DELETE rel, p

Here is the result of running this Cypher statement:

DeleteMelniker

And here we see that we now have only two connections to the Batman Begins movie:

BatmanBeginsRelationships3

Deleting nodes and relationships

The most efficient way to delete a node and its corresponding relationships is to specify DETACH DELETE. When you specify DETACH DELETE for a node, the relationships to and from the node are deleted, then the node is deleted.

If we were to attempt to delete the Liam Neeson node without first deleting its relationships:

MATCH (p:Person)
WHERE p.name = 'Liam Neeson'
DELETE p

We would see this error:

LiamNeesonDeleteError

Here we delete the Liam Neeson node and its relationships to any other nodes:

MATCH (p:Person)
WHERE p.name = 'Liam Neeson'
DETACH DELETE  p

Here is the result of running this Cypher statement:

DeleteLiamNeeson

And here is what the Batman Begins node and its relationships now look like. There is only one actor, Michael Caine connected to the movie.

BatmanBeginsRelationships4

Exercise 10: Deleting Nodes and Relationships

In the query edit pane of Neo4j Browser, execute the browser command:

:play intro-neo4j-exercises

and follow the instructions for Exercise 10.

Merging data in the graph

Thus far, you have learned how to create nodes, labels, properties, and relationships in the graph. You can use MERGE to either create new nodes and relationships or to make structural changes to existing nodes and relationships.

For example, how the graph engine behaves when a duplicate element is created depends on the type of element:

If you use CREATE:

The result is:

Node

If a node with the same property values exists, a duplicate node is created.

Label

If the label already exists for the node, the node is not updated.

Property

If the node or relationship property already exists, it is updated with the new value.

Note: If you specify a set of properties to be created using = rather than +=, existing properties are removed if they are not included in the set.

Relationship

If the relationship exists, a duplicate relationship is created.

You must never create duplicate nodes or relationships in a graph.

The MERGE clause is used to find elements in the graph. If the element is not found, it is created.

You use the MERGE clause to:

  • Create a unique node based on label and key information for a property and if it exists, optionally update it.

  • Create a unique relationship.

  • Create a node and relationship to it uniquely in the context of another node.

Using MERGE to create nodes

Here is the simplified syntax for the MERGE clause for creating a node:

MERGE (variable:Label{nodeProperties})
RETURN variable

If there is an existing node with Label and nodeProperties found in the graph, no node is created. If, however the node is not found in the graph, then the node is created.

When you specify nodeProperties for MERGE, only use properties that satisfy some sort of uniqueness constraint. You will learn about uniqueness constraints in the next module.

Here is what we currently have in the graph for the Person, Michael Caine. This node has values for name and born. Notice also that the label for the node is Person.

MichaelCaine

Here we use MERGE to find a node with the Actor label with the key property name of Michael Caine, and we set the born property to 1933. Our data model has never used the label, Actor so this is a new entity type in our graph.

MERGE (a:Actor {name: 'Michael Caine'})
SET a.born = 1933
RETURN a

Here is the result of running this Cypher example. We do not find a node with the label Actor so the graph engine creates one.

MergeActorMichaelCaine
When you specify the node to merge, only use properties that have a unique index. You will learn about uniqueness later in this training.

If we were to repeat this MERGE clause, no additional Actor nodes would be created in the graph.

At this point, however, we have two Michael Caine nodes in the graph, one of type Person, and one of type Actor:

TwoMichaelCaines
Be mindful that node labels and the properties for a node are significant when merging nodes.

Using MERGE to create relationships

Here is the syntax for the MERGE clause for creating relationships:

MERGE (variable:Label {nodeProperties})-[:REL_TYPE]->(otherNode)
RETURN variable

If there is an existing node with Label and nodeProperties with the :REL_TYPE to otherNode found in the graph, no relationship is created. If the relationship does not exist, it is created.

Although, you can leave out the direction of the relationship being created with the MERGE, in which case a left-to-right arrow will be assumed, a best practice is to always specify the direction of the relationship. However, if you have bidirectional relationships and you want to avoid creating duplicate relationships, you must leave off the arrow.

Specifying creation behavior when merging

You can use the MERGE clause, along with ON CREATE to assign specific values to a node being created as a result of an attempt to merge.

Here is an example of creating a new node while specifying property values for the new node:

MERGE (a:Person {name: 'Sir Michael Caine'})
ON CREATE SET a.birthPlace = 'London',
              a.born = 1934
RETURN a

We know that there are no existing Sir Michael Caine Person nodes. When the MERGE executes, it will not find any matching nodes so it will create one and will execute the ON CREATE clause where we set the birthplace and born property values.

Here are the resulting nodes that have anything to do with Michael Caine. The most recently created node has the name value of Sir Michael Caine.

AllMichaelCaines

You can also specify an ON MATCH clause during merge processing. If the exact node is found, you can update its properties or labels. Here is an example:

MERGE (a:Person {name: 'Sir Michael Caine'})
ON CREATE SET a.born = 1934,
              a.birthPlace = 'UK'
ON MATCH SET a.birthPlace = 'UK'
RETURN a

And here we see that the found node (with the <id> of 1920) was updated with the new value for birthPlace.

UpdateToUK

Using MERGE to create relationships

Using MERGE to create relationships is expensive. Only do it when you need to ensure that a relationship is unique and you are not sure if it already exists.

In this example, we use the MATCH clause to find all Person nodes that represent Michael Caine and we find the movie, Batman Begins that we want to connect to all of these nodes. We already have a connection between one of the Person nodes and the Movie node. We do not want this relationship to be duplicated. This is where we can use MERGE as follows:

MATCH (p:Person), (m:Movie)
WHERE m.title = 'Batman Begins' AND p.name ENDS WITH 'Caine'
MERGE (p)-[:ACTED_IN]->(m)
RETURN p, m

Here is the result of executing this Cypher statement. It went through all the nodes and added the relationship to the nodes that didn’t already have the relationship.

MergeRelationships

You must be aware of the behavior of the MERGE clause and how it will automatically create nodes and relationships. MERGE tries to find a full pattern and if it doesn’t find it, it creates that full pattern. That’s why in most cases you will first MERGE your nodes and then your relationship afterwards.

Only if you intentionally want to create a node within the context of another (like a month within a year) then a MERGE pattern with one bound and one unbound node makes sense.

For example:

MERGE (fromDate:Date {year: 2018})<-[:IN_YEAR]-(toDate:Date {month: 'January'})

Exercise 11: Merging Data in the graph

In the query edit pane of Neo4j Browser, execute the browser command:

:play intro-neo4j-exercises

and follow the instructions for Exercise 11.

Check your understanding

Question 1

What Cypher clauses can you use to create a node?

Select the correct answers.

  • CREATE

  • CREATE NODE

  • MERGE

  • ADD

Question 2

Suppose that you have retrieved a node, s with a property, color:

MATCH (s:Shape {location: [20,30]})
???
RETURN s

What Cypher clause do you add here to delete the color property from this node?

Select the correct answers.

  • DELETE s.color

  • SET s.color=null

  • REMOVE s.color

  • SET s.color=?

Question 3

Suppose you retrieve a node, n in the graph that is related to other nodes. What Cypher clause do you write to delete this node and its relationships in the graph?

Select the correct answer.

  • DELETE n

  • DELETE n WITH RELATIONSHIPS

  • REMOVE n

  • DETACH DELETE n

Summary

You can now write Cypher statements to:

  • Create a node

    • Add and remove node labels

    • Add and remove node properties

    • Update properties

  • Create a relationship

    • Add and remove properties for a relationship

  • Delete a node

  • Delete a relationship

  • Merge data in a graph

    • Creating nodes

    • Creating relationships