You have learned how to define both uniqueness and existence constraints for your data. A uniqueness constraint is also an index where the default implementation is a b-tree structure. Next, you will learn about creating property and full-text schema indexes in the graph.
At the end of this module, you should be able to:
Describe when indexes are used in Cypher.
Create a single property index.
Create a multi-property index.
Create a full-text schema index.
Use a full-text schema index.
Drop an index.
Drop a full-text schema index.
Because the code examples in this lesson modify the database, it is recommended that you do not execute them against your database as you will be doing so in the hands-on exercises.
The uniqueness and node key constraints that you add to a graph are essentially single-property and composite indexes respectively. Indexes are used to improve initial node lookup performance, but they require additional storage in the graph to maintain and also add to the cost of creating or modifying property values that are indexed. Indexes store redundant data that points to nodes with the specific property value or values. Unlike SQL, there is no such thing as a primary key in Neo4j. You can have multiple properties on nodes that must be unique.
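Single-property indexes are used for:
Equality checks (=)
Range comparisons (>, >=, <, <=)
List membership (IN)
String comparisons (STARTS WITH, ENDS WITH, CONTAINS)
Existence checks (exists())
Spatial distance searches (distance())
Spatial bounding searches (point())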
Composite indexes are used only for equality checks and list membership.
In this course, we introduce the basics of Neo4j b-tree indexes, but you should consult the Neo4j Operations Manual for more details about creating and maintaining indexes.
Because index maintenance incurs additional overhead when nodes are created, Neo4j recommends that for large graphs, indexes are created after the data has been loaded into the graph.
When you add an index for a property of a node, it can greatly reduce the number of nodes the graph engine needs to visit in order to satisfy a query.
In this query we are testing the value of the released property of a Movie node using ranges:
MATCH (m:Movie) WHERE 1990 < m.released < 2000 SET m.videoFormat = 'DVD'
If there is an index on the released property, the graph engine uses the index to find pointers to the nodes that satisfy the query, without having to visit all of the Movie nodes.
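One way to check whether the planner would use an index for this kind of predicate is to prefix a read-only version of the query with EXPLAIN, which shows the execution plan without running the query. The sketch below assumes an index on the released property of Movie nodes already exists; the operator names in the comment are what Neo4j 4.x typically reports and may differ between versions.

```cypher
// Show the plan only; nothing is executed or modified.
// With an index on :Movie(released), the plan is expected to begin with an
// index range seek (for example NodeIndexSeekByRange) instead of NodeByLabelScan.
EXPLAIN
MATCH (m:Movie)
WHERE 1990 < m.released < 2000
RETURN m.title, m.released
```

Replacing EXPLAIN with PROFILE would run the query and also report how many rows each operator touched.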
You create an index to improve graph engine performance. A uniqueness constraint on a property is an index so you need not create an index for any properties you have created uniqueness constraints for. An index on its own does not guarantee uniqueness.
Here is an example of how we create a single-property index on the released property of all nodes with the Movie label:
CREATE INDEX MovieReleased FOR (m:Movie) ON (m.released)
Notice that just as for constraints, a best practice is to specify a name for the index. In this case, the name is MovieReleased.
If a set of properties for a node must be unique for every node, then you should create a constraint as a node key, rather than an index.
If, however, there can be duplication for a set of property values, but you want faster access to them, then you can create a composite index. A composite index is based upon multiple properties for a node.
Suppose we added the property videoFormat to every Movie node and set its value based upon the release year of the movie, as follows:
MATCH (m:Movie) WHERE m.released >= 2000 SET m.videoFormat = 'DVD';
MATCH (m:Movie) WHERE m.released < 2000 SET m.videoFormat = 'VHS'
Notice that in the above Cypher statements we use the semicolon (;) to separate the two statements.
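With both released and videoFormat set on the Movie nodes, a composite index over the two properties could be created as sketched below. The name MovieReleasedVideoFormat is the one this lesson later passes to DROP INDEX; the follow-up query is an illustrative assumption showing the kind of equality lookup a composite index supports.

```cypher
// Composite index over two properties of Movie nodes (Neo4j 4.0 syntax).
CREATE INDEX MovieReleasedVideoFormat FOR (m:Movie) ON (m.released, m.videoFormat);

// Equality checks on both indexed properties can be served by the composite index.
MATCH (m:Movie)
WHERE m.released = 1999 AND m.videoFormat = 'VHS'
RETURN m.title
```

The two statements are meant to be run one after the other; the index may need a moment to come online before the planner uses it.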
A full-text schema index is based upon string values only, but it provides additional search capabilities that you do not get from property indexes. A full-text schema index can be used for:
Node or relationship properties.
Single property or multiple properties.
Single or multiple types of nodes (labels).
Single or multiple types of relationships.
Rather than using Cypher syntax to create a full-text schema index, you call a procedure to create the index. The index is not used implicitly when you query the graph. You must call a procedure to start a query that uses the index. By default, the underlying implementation of a full-text schema index is Lucene. You can change the underlying index provider of any index.
Here is an example where we create a full-text schema index on data in the title property of Movie nodes and data in the name property of Person nodes:
CALL db.index.fulltext.createNodeIndex( 'MovieTitlePersonName',['Movie', 'Person'], ['title', 'name'])
The result returned shows nothing exceptional.
After creating a full-text schema index, you can always get a listing of all existing indexes.
The listing includes our newly-created full-text schema index, MovieTitlePersonName.
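In Neo4j 4.0, which the syntax in this lesson targets, such a listing is available through the built-in db.indexes() procedure (later releases also offer SHOW INDEXES). A minimal sketch:

```cypher
// List every index in the database, including the indexes that back
// uniqueness constraints and any full-text schema indexes.
CALL db.indexes()
```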
Just as you can create a full-text schema index on properties of nodes, you can create a full-text schema index on properties of relationships.
To do this you use the db.index.fulltext.createRelationshipIndex procedure.
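As a sketch of the relationship variant, the following uses the db.index.fulltext.createRelationshipIndex and db.index.fulltext.queryRelationships procedures of Neo4j 4.0. The index name ReviewSummary is hypothetical, and the example assumes the graph contains REVIEWED relationships with a string property named summary; the search term is likewise only illustrative.

```cypher
// Hypothetical index name; REVIEWED and summary are assumed to exist in the graph.
CALL db.index.fulltext.createRelationshipIndex(
  'ReviewSummary', ['REVIEWED'], ['summary']);

// Query the relationship index; note that the yielded column is relationship, not node.
CALL db.index.fulltext.queryRelationships('ReviewSummary', 'fun')
YIELD relationship, score
RETURN relationship.summary, score
```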
To use a full-text schema index, you must call the query procedure that uses the index.
Here is an example where we want to find all movie titles and person names that contain the string Jerry:
CALL db.index.fulltext.queryNodes( 'MovieTitlePersonName', 'Jerry') YIELD node RETURN node
Notice that we specify YIELD after calling the procedure. This enables us to use return values from the procedure.
In this case, we return all nodes that are found in the graph that have either a title property or name property containing the string, Jerry.
When a full-text schema index is used, it calculates a "hit score" that represents the closeness of the values in the graph to the query string.
Here is an example:
CALL db.index.fulltext.queryNodes( 'MovieTitlePersonName', 'Matrix') YIELD node, score RETURN node.title, score
The nodes returned have a Lucene score based upon how much of Matrix was part of the title.
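Because each hit carries a score, the rows can be sorted and trimmed like any other query result. A small sketch, assuming the MovieTitlePersonName index created earlier; the limit of 3 is arbitrary, and coalesce is used only because a hit may be a Movie (with a title) or a Person (with a name).

```cypher
CALL db.index.fulltext.queryNodes('MovieTitlePersonName', 'Matrix')
YIELD node, score
RETURN labels(node) AS labels,
       coalesce(node.title, node.name) AS text,  // title for Movie, name for Person
       score
ORDER BY score DESC   // best matches first
LIMIT 3
```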
With a full-text schema index created, you can also specify which property you want to search. Here is an example where we are looking for Jerry, but only in the name property of a Person node:
CALL db.index.fulltext.queryNodes( 'MovieTitlePersonName', 'name: Jerry') YIELD node RETURN node
Please see the Cypher Reference Manual for more on using full-text schema indexes.
You have already seen a listing of the three types of indexes in our database thus far.
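To review just the details needed before dropping an index, the listing can be narrowed to a few columns. This sketch assumes the column names that db.indexes() yields in Neo4j 4.0 (name, type, uniqueness, labelsOrTypes, properties); other versions may expose different columns.

```cypher
// Narrow the index listing to the columns that identify each index.
CALL db.indexes()
YIELD name, type, uniqueness, labelsOrTypes, properties
RETURN name, type, uniqueness, labelsOrTypes, properties
ORDER BY name
```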
To drop an index on a property, you simply use the DROP INDEX clause, specifying the name of the index:
DROP INDEX MovieReleasedVideoFormat
To drop a full-text schema index, you must call the procedure. Here we drop the index that we created earlier:
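A minimal sketch of that call, assuming the db.index.fulltext.drop procedure of Neo4j 4.0 and the index name MovieTitlePersonName used when the index was created:

```cypher
// Drop the full-text schema index by name.
CALL db.index.fulltext.drop('MovieTitlePersonName')
```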
In the query edit pane of Neo4j Browser, execute the browser command:
and follow the instructions for Exercise 14.
This exercise has 7 steps. Estimated time to complete: 30 minutes.
What Cypher code below will create a unique index on the name property of the Person node?
Select the correct answer.
CREATE INDEX PersonNameIndex FOR (p:Person) ON (p.name)
CREATE INDEX PersonNameIndex FOR (p:Person) ON (p.name) ASSERT p.name IS UNIQUE
CREATE CONSTRAINT PersonNameConstraint ON (p:Person) ASSERT p.name IS UNIQUE
CALL db.index.full-text.createNodeIndex('PersonName',['Person'], ['name'])
What makes creating a full-text schema index different from creating a property index?
Select the correct answers.
Full-text schema indexes can use relationship properties.
Full-text schema indexes can check for uniqueness.
Full-text schema indexes can use multiple types of nodes for the index.
Full-text schema indexes can be used to ensure the existence of a property.
You should now be able to:
Describe when indexes are used in Cypher
Create a single property index
Create a multi-property index
Create a full-text schema index
Use a full-text schema index
Drop an index
Drop a full-text schema index