Defining a schema

Example graph

First create some data to use for our examples:

CREATE (forrestGump:Movie {title: 'Forrest Gump', released: 1994})
CREATE (robert:Person:Director {name: 'Robert Zemeckis', born: 1951})
CREATE (tom:Person:Actor {name: 'Tom Hanks', born: 1956})
CREATE (tom)-[:ACTED_IN {roles: ['Forrest']}]->(forrestGump)
CREATE (robert)-[:DIRECTED]->(forrestGump)

This is the resulting graph:

cypher intro schema data arr

Using indexes

The main reason for using indexes in a graph database is to find the starting point of a graph traversal. Once that starting point is found, the traversal relies on in-graph structures to achieve high performance.

Indexes can be added at any time.

If there is existing data in the database, it will take some time for an index to come online.

The following query creates an index to speed up finding actors by name in the database:

CREATE INDEX example_index_1 FOR (a:Actor) ON (a.name)

In most cases it is not necessary to specify indexes when querying for data, as the appropriate indexes will be used automatically.

It is possible to specify which index to use in a particular query, using index hints. This is one of several options for query tuning, described in detail in Cypher® manual → Query tuning.

For example, the following query will automatically use the example_index_1:

MATCH (actor:Actor {name: 'Tom Hanks'})
RETURN actor

A composite index is an index on multiple properties for all nodes that have a particular label. For example, the following statement will create a composite index on all nodes labeled with Actor and which have both a name and a born property. Note that since the node with the Actor label that has a name of "Keanu Reeves" does not have the born property. Therefore that node will not be added to the index.

CREATE INDEX example_index_2 FOR (a:Actor) ON (a.name, a.born)

You can query a database with SHOW INDEXES to find out what indexes are defined.

SHOW INDEXES YIELD name, labelsOrTypes, properties, type
Rows: 2

+----------------------------------------------------------------+
| name              | labelsOrTypes | properties       | type    |
+----------------------------------------------------------------+
| 'example_index_1' | ['Actor']     | ['name']         | 'BTREE' |
| 'example_index_2' | ['Actor']     | ['name', 'born'] | 'BTREE' |
+----------------------------------------------------------------+

Learn more about indexes in Cypher Manual → Indexes.

Using constraints

Constraints are used to make sure that the data adheres to the rules of the domain. For example:

"If a node has a label of Actor and a property of name, then the value of name must be unique among all nodes that have the Actor label".

Example 1. Uniqueness constraint

This example shows how to create a constraint for nodes that have the label Movie and the property title. The constraint specifies that the title property must be unique.

Adding the unique constraint will implicitly add an index on that property. If the constraint is dropped, but the index is still needed, the index will have to be created explicitly.

CREATE CONSTRAINT constraint_example_1 FOR (movie:Movie) REQUIRE movie.title IS UNIQUE

The syntax was changed in Neo4j 4.4, the old syntax is:

CREATE CONSTRAINT constraint_example_1 ON (movie:Movie) ASSERT movie.title IS UNIQUE Deprecated

Constraints can be added to database that already has data in it. This requires that the existing data complies with the constraint that is being added.

You can query a database to find out what constraints are defined with the SHOW CONSTRAINTS Cypher syntax.

Example 2. Constraints query

This example shows a Cypher query that returns the constraints that has been defined for the database.

SHOW CONSTRAINTS YIELD id, name, type, entityType, labelsOrTypes, properties, ownedIndexId
Rows: 1

+-----------------------------------------------------------------------------------------------------+
| id | name                   | type         | entityType | labelsOrTypes | properties | ownedIndexId |
+-----------------------------------------------------------------------------------------------------+
| 4  | 'constraint_example_1' | 'UNIQUENESS' | 'NODE'     | ['Movie']     | ['title']  | 3            |
+-----------------------------------------------------------------------------------------------------+

The constraint described above is available for all editions of Neo4j. Additional constraints are available for Neo4j Enterprise Edition.

Learn more about constraints in Cypher manual → Constraints.