Defining Constraints for your Data
About this module
In the courses Querying with Cypher in Neo4j 4.x and Creating Nodes and Relationships in Neo4j 4.x, you learned how to query the graph and how to create nodes and relationships in it. In most graphs, you will want to enforce uniqueness for some properties of the nodes or relationships. In addition, you may want to make certain properties mandatory for selected nodes and relationships. These rules are called constraints in a Neo4j graph data model and are typically defined during the graph data modeling phase of your application development.
In Neo4j 4.3 Enterprise Edition and later, you can create constraints on relationship properties.
At the end of this module, you will be able to:
- Create a uniqueness constraint for a node property in the graph.
- Create an existence constraint for a node property in the graph.
- Create a uniqueness constraint for a set of node properties in the graph.
- Manage constraints in the graph.
Because the code examples in this lesson modify the database, we recommend that you do not execute them against your database now; you will do so in the hands-on exercises.
Uniqueness and existence in the graph
You have seen that it is possible to create duplicate nodes in the graph.
In most graphs, you will want to prevent duplication of data. Unfortunately, you cannot prevent duplication by first checking whether the exact node (with its properties) already exists, because this type of test uses no locks and is therefore not cluster-safe or multi-thread-safe. This is one reason why MERGE is preferred over CREATE: MERGE does use locks.
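As a minimal sketch of the difference, the statement below can be run repeatedly and should leave only one matching node, whereas the equivalent CREATE would add a new node on every run. The title 'Example Movie' and the released value are hypothetical and are not part of the Movie graph.

```cypher
// First run: no Movie node with this title exists, so MERGE creates one.
// Later runs: MERGE matches the existing node instead of creating another.
MERGE (m:Movie {title: 'Example Movie'})
ON CREATE SET m.released = 2020   // applied only when the node is created
RETURN m.title, m.released
```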
In addition, you have learned that a node or relationship need not have a particular property. What if you want to ensure that all nodes or relationships of a specific type (label) must set values for certain properties?
A third scenario is where you want to ensure that a set of property values, taken together, is unique for nodes of the same type. This is similar to a composite primary key in a relational database.
All of these scenarios are common in many graphs.
In Neo4j, you can use Cypher to:
- Add a uniqueness constraint that ensures that a value for a property is unique for all nodes of that type.
- Add an existence constraint that ensures that when a node or relationship is created or modified, it must have certain properties set.
- Add a node key that ensures that a set of values for properties of a node of a given type is unique.
Constraints and node keys that enforce uniqueness are related to indexes, which you will learn about later in this course.
Existence constraints and node keys are only available in Enterprise Edition of Neo4j.
Ensuring that a property value for a node is unique
You add a uniqueness constraint to the graph by creating a constraint that asserts that a particular node property is unique in the graph for a particular type of node.
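Creating a uniqueness constraint fails if the graph already contains duplicate values for the property, so it can help to look for duplicates first. The query below is one possible check, assuming the Movie label and title property used throughout this module; the column names are illustrative.

```cypher
// List every title that is shared by more than one Movie node.
MATCH (m:Movie)
WHERE m.title IS NOT NULL          // nodes without a title do not block the constraint
WITH m.title AS title, count(*) AS copies
WHERE copies > 1
RETURN title, copies
ORDER BY copies DESC
```

If this returns no rows, no existing Movie nodes share a title.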
Here is an example for ensuring that the title for a node of type Movie is unique:
CREATE CONSTRAINT UniqueMovieTitleConstraint ON (m:Movie) ASSERT m.title IS UNIQUE
Although the name of the constraint, UniqueMovieTitleConstraint, is optional, Neo4j recommends that you provide one. Otherwise, the constraint is given an auto-generated name. This Cypher statement will fail if the graph already has multiple Movie nodes with the same value for the title property. Note that you can create a uniqueness constraint even if some Movie nodes do not have a title property.
Here is the result of running this Cypher statement on the Movie graph:
If we now attempt to create a Movie node with the title The Matrix, the Cypher statement will fail because the graph already has a movie with that title:
CREATE (:Movie {title: 'The Matrix'})
Here is the result of running this Cypher statement on the Movie graph:
Ensuring that properties exist
Having uniqueness for a property value is only useful in the graph if the property exists. In most cases, you will want your graph to also enforce the existence of properties, not only for those node properties that require uniqueness, but for other nodes and relationships where you require a property to be set. Uniqueness constraints can only be created for nodes, but existence constraints can be created for node or relationship properties.
You add an existence constraint to the graph by creating a constraint that asserts that a particular type of node or relationship property must exist in the graph when a node or relationship of that type is created or updated.
Recall that in the Movie graph, the movie Something’s Gotta Give has no tagline property:
MATCH (m:Movie)
WHERE m.title CONTAINS 'Gotta'
RETURN m
Example: Existence constraint addition fails
Here is an example for adding the existence constraint to the tagline property of all Movie nodes in the graph:
CREATE CONSTRAINT ExistsMovieTagline ON (m:Movie) ASSERT exists(m.tagline)
Here is the result of running this Cypher statement:
The constraint cannot be added to the graph because a node has been detected that violates the constraint.
In Neo4j 4.3, exists() is deprecated and may not work in future releases, although it still works in 4.3. In Neo4j 4.3 and later, you can instead use CREATE CONSTRAINT ExistsMovieTagline ON (m:Movie) ASSERT m.tagline IS NOT NULL.
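To see which nodes prevent the existence constraint from being created, you can list the Movie nodes that have no tagline. This is a small sketch that assumes only the Movie label and the title and tagline properties already used in this module.

```cypher
// Find Movie nodes that would violate an existence constraint on tagline.
MATCH (m:Movie)
WHERE m.tagline IS NULL
RETURN m.title AS movieWithoutTagline
```

Each returned movie needs a tagline value before the constraint can be added.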
Example: Adding the existence constraint
We know that in the Movie graph, all :REVIEWED relationships currently have a property, rating. We can create an existence constraint on that property as follows:
CREATE CONSTRAINT ExistsREVIEWEDRating
ON ()-[rel:REVIEWED]-() ASSERT exists(rel.rating)
Notice that when you create the constraint on a relationship, you need not specify the direction of the relationship. With the result:
Attempting to add relationship without property
After creating this constraint, suppose we attempt to create a :REVIEWED relationship without setting the rating property:
MATCH (p:Person), (m:Movie)
WHERE p.name = 'Jessica Thompson' AND
m.title = 'The Matrix'
MERGE (p)-[:REVIEWED {summary: 'Great movie!'}]->(m)
We see this error:
Attempting to remove property from relationship
You will also see this error if you attempt to remove a property from a node or relationship where the existence constraint has been created in the graph.
MATCH (p:Person)-[rel:REVIEWED]-(m:Movie)
WHERE p.name = 'Jessica Thompson'
REMOVE rel.rating
Here is the result:
Retrieving constraints defined for the graph
You can query for the set of constraints defined in the graph as follows:
CALL db.constraints()
In Neo4j 4.2 and later you can use SHOW CONSTRAINTS.
And here is what is returned from the graph:
Dropping constraints
You remove constraints defined for the graph with the DROP CONSTRAINT clause.
Here we drop the existence constraint named ExistsREVIEWEDRating:
DROP CONSTRAINT ExistsREVIEWEDRating
With the result:
Creating multi-property uniqueness/existence constraint: node key
A node key is used to define the uniqueness and existence constraint for multiple properties of a node of a certain type. A node key is also used as a composite index in the graph.
Suppose that in our Movie graph, we will not allow a Person node to be created where both the name and born properties are the same. We can create a constraint that will be a node key to ensure that this uniqueness for the set of properties is asserted.
Here is an example to create this node key:
CREATE CONSTRAINT UniqueNameBornConstraint
ON (p:Person) ASSERT (p.name, p.born) IS NODE KEY
Here is the result of running this Cypher statement on our Movie graph:
This attempt to create the constraint failed because there are Person nodes in the graph that do not have the born property defined.
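Before creating a node key, it can be useful to check both conditions that it enforces: that every node has all the properties, and that the combination of values is unique. The two queries below sketch these checks for the Person label with the name and born properties; the returned column names are illustrative.

```cypher
// 1. Count Person nodes that are missing one of the node key properties.
MATCH (p:Person)
WHERE p.name IS NULL OR p.born IS NULL
RETURN count(p) AS incompletePeople;

// 2. List (name, born) combinations shared by more than one Person node.
MATCH (p:Person)
WITH p.name AS name, p.born AS born, count(*) AS copies
WHERE copies > 1
RETURN name, born, copies;
```

The node key can only be created once the first query returns 0 and the second returns no rows.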
Cleaning up the graph to support constraint
We can fix this by setting the born property on every Person node that does not have one:
MATCH (p:Person)
WHERE NOT exists(p.born)
SET p.born = 0
With this result:
Then the creation of the node key succeeds:
Any subsequent attempt to create or modify an existing Person node with name or born values that violate the uniqueness constraint as a node key will fail.
Testing the node key - duplicate data
For example, executing this Cypher statement will fail:
CREATE (:Person {name: 'Jessica Thompson', born: 0})
Here is the result:
If you have defined a node key that, for example, covers two properties, every node with that label must have a unique combination of values for those properties. Additionally, every such node must have all of the properties in the node key.
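The existence half of the node key can be illustrated with a statement that omits one of the key properties. This sketch assumes the UniqueNameBornConstraint node key has been created; the name 'Example Person' is hypothetical.

```cypher
// Expected to fail while the node key on (name, born) is in place,
// because the born property is missing.
CREATE (:Person {name: 'Example Person'})
```

Supplying both properties with a combination that no other Person node uses, such as {name: 'Example Person', born: 1980}, would satisfy the node key.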
Exercise 13: Defining constraints on your data
Prior to performing this exercise, set up your development environment to use one of the following, which are covered in the course Overview of Neo4j 4.x.
- Neo4j Desktop
- Neo4j Sandbox
- Neo4j Aura
In the query edit pane of Neo4j Browser, execute the browser command:
:play 4.0-intro-neo4j-exercises
and follow the instructions for Exercise 13.
This exercise has 14 steps. Estimated time to complete: 30 minutes.
Check your understanding
Question 1
What are some of the constraints you can create for the data in your graph?
Select the correct answers.
- Property for a node with a given label is always a string value.
- Property value for a node with a given label is unique.
- Property for a node with a given label must exist.
- Property value for a relationship is unique.
Question 2
What types of uniqueness constraints can you define for a graph?
Select the correct answers.
- Unique values for a property of a node
- Unique values for a property of a relationship
- Unique values for a set of properties of a node
- Unique values for a set of properties of a relationship
Question 3
Which statements below are true about node keys?
Select the correct answers.
- A node key does not require you to define a label for the node.
- A node key requires you to define a label for the node.
- You do not have to specify properties when you define a node key.
- You can specify an unlimited number of properties when you define a node key.
Summary
You can now:
- Create a uniqueness constraint for a node property in the graph.
- Create an existence constraint for a node property in the graph.
- Create a uniqueness constraint for a set of node properties in the graph.
- Manage constraints in the graph.